Xinyao Wu Blog

Thinking will not overcome fear but action will.

LLM Science Exam

An ML pipeline for building a Large Language Model

Identifying age related conditions

Utilized

The Evolution of Data Storage

Data Warehousing, Data Lake, and Data Lakehouse

Mind map of this article. Introduction: This article is inspired by a post written by a Databricks engineer. It is aimed at company engineers who use the Databricks ecosystem but are unclear about ...

Unleashing the Power of LLMs

Revolutionizing Natural Language with Advanced AI Models

What is an LLM? A Large Language Model (LLM) refers to a type of artificial intelligence model designed to understand and generate human-like text. These models are trained on vast amounts of text ...

Twitter Hateful Comments Detection

Use NLP and a bidirectional LSTM to detect hateful comments and help maintain a healthy Twitter environment.

Goal: Use TPUs to identify toxic comments across multiple languages. import numpy as np import pandas as pd from tqdm import tqdm from sklearn.model_selection import train_test_split import ten...

Identify Fraudulent Activities

Use a Random Forest to detect fraudulent activities on e-commerce websites.

Goal: Build a machine learning model that predicts the probability that the first transaction of a new user is fraudulent. import numpy as np import pandas as pd import seaborn as sns import matplo...

Credit Card Fraud Detection

Use K-Means clustering to detect irregular credit card transactions.

Goal: Identify unusual credit card transactions that have a high chance of being fraudulent. import pandas as pd import numpy as np from sklearn.cluster import KMeans 1: Customer...

Funds Inflow Prediction (2014) for Ant Financial Services Group (AFSG)

Use ARIMA and STL to find the funds inflow trend for AFSG.

Goal: Predict the funds inflow trend using historical time series data. import matplotlib.pyplot as plt import pandas as pd import numpy as np import statsmodels.api as sm from statsmodels.tsa....