Autonomous Alpha Factor Research for Chinese A-Shares
AI agents invent, iterate, and optimize quantitative factors guided by Pareto frontier optimization. ~60 experiments per hour, zero human intervention while you sleep.
RankIC · IC IR · Turnover · Pareto Front · 12 Operators
System Architecture
Agent edits factors.py
Write 1–10 Factor subclasses per experiment with 12 built-in operators
Non-dominated factors · git-tracked · permanent research artifact
factors dominant on ≥1 metric
Three Metrics
First-Principles Evaluation
A factor must predict returns, do so consistently, and be cheap to trade. These three dimensions cannot be maximized simultaneously — the agent discovers the Pareto frontier.
Metric 1 — Predictive Power
RankIC
mean( Spearman(factor, fwd_return) )
Cross-sectional daily Spearman correlation. Range [-1,1]. Uses absolute value for Pareto comparison. Computed over full 2020–2025 period.
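The RankIC definition above can be sketched in plain Python. This is a minimal illustration, not the project's evaluator: the function names (`average_ranks`, `spearman`, `rank_ic`), the dict-of-lists layout, and the toy numbers are all assumptions made for the example.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def rank_ic(factor_by_day, fwd_return_by_day):
    """Mean of the per-day cross-sectional Spearman correlations."""
    daily = [spearman(factor_by_day[d], fwd_return_by_day[d])
             for d in sorted(factor_by_day)]
    return mean(daily), daily

# Toy data (made up): four stocks, two days, same stock order in every list.
factor = {"2024-01-02": [0.1, 0.4, 0.3, 0.9],
          "2024-01-03": [0.5, 0.2, 0.8, 0.1]}
fwd_return = {"2024-01-02": [0.01, 0.02, 0.03, 0.05],
              "2024-01-03": [0.00, -0.01, 0.03, 0.02]}

ic, daily_ic = rank_ic(factor, fwd_return)
print(round(ic, 4), [round(v, 4) for v in daily_ic])  # 0.6 [0.8, 0.4]
```

For the Pareto comparison described above, `abs(ic)` would be the value carried forward.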
Metric 2 — Stability
IC IR
mean(IC) / std(IC)
Information Ratio of daily IC values. Signal-to-noise. IC=0.05 with std=0.10 → IR=0.5 (noisy). IC=0.04 with std=0.04 → IR=1.0 (reliable).
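The two worked cases above can be reproduced in a few lines. This sketch assumes the population standard deviation (`statistics.pstdev`); the source does not say whether the sample or population form is used, and `ic_ir` and the toy series are names and values invented for the example.

```python
from statistics import mean, pstdev

def ic_ir(daily_ic):
    """mean(IC) / std(IC) over a series of daily IC values.

    Assumption: population standard deviation; the series must not be constant.
    """
    return mean(daily_ic) / pstdev(daily_ic)

# Toy series chosen to match the two cases in the text.
noisy = [-0.05, 0.15]     # mean 0.05, std 0.10
reliable = [0.00, 0.08]   # mean 0.04, std 0.04

print(round(ic_ir(noisy), 4))     # 0.5
print(round(ic_ir(reliable), 4))  # 1.0
```

The second series has the lower average IC yet scores twice as high, which is the point of tracking stability separately from predictive power.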
Metric 3 — Tradeability
Turnover
1 − mean(|rank_t − rank_{t−1}|)
Day-over-day ranking stability, so higher means cheaper to trade. 1.0 = identical ranks every day (zero trading cost). 0.5 = rank percentiles move by 0.5 on average each day.
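A minimal sketch of the formula above, assuming ranks are percentiles scaled to [0, 1] and that every day lists the same stocks in the same order. `pct_rank`, `rank_stability`, and the toy inputs are hypothetical names and values, and ties are not given special treatment here.

```python
from statistics import mean

def pct_rank(values):
    """Percentile ranks in [0, 1]: smallest value -> 0.0, largest -> 1.0."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for pos, idx in enumerate(order):
        ranks[idx] = pos / (len(values) - 1)
    return ranks

def rank_stability(factor_by_day):
    """1 - mean(|rank_t - rank_{t-1}|), averaged over stocks, then over days."""
    days = sorted(factor_by_day)
    ranks = [pct_rank(factor_by_day[d]) for d in days]
    daily_change = [
        mean(abs(a - b) for a, b in zip(today, yesterday))
        for yesterday, today in zip(ranks, ranks[1:])
    ]
    return 1 - mean(daily_change)

# Same ordering on both days: nothing to trade.
steady = {"d1": [1, 2, 3], "d2": [10, 20, 30]}
# Ordering fully reversed on day two.
flipped = {"d1": [1, 2, 3], "d2": [3, 2, 1]}

print(rank_stability(steady))             # 1.0
print(round(rank_stability(flipped), 4))  # 0.3333
```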
Experiment Results
53 Experiments · 48 Factors Evaluated · 0 Crashes
30+ iterations generated 48 factor candidates across 14 categories. All were evaluated successfully, with zero crashes. The Pareto frontier now contains 19 unique non-dominated factor types (23 entries total). Latest experiment: 10 new factors pushed the frontier into the high-IC / high-turnover region.
The agent writes Factor subclasses in factors.py. Each class is automatically discovered, evaluated against 495 A-shares over 5 years, and checked against the Pareto frontier.
factors.py
from prepare import Factor, ops

class Factor001(Factor):
    name = "momentum_5d"

    def compute(self, df):
        m = df.set_index(["datetime", "symbol"])
        val = ops.cs_rank(m["close"] - ops.delay(m["close"], 5))
        return Factor.as_cs_series(df, val)
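Once a factor has its three scores, the frontier check could look roughly like the sketch below. It assumes each factor is summarized as a tuple (|RankIC|, IC IR, Turnover score) with higher being better on all three; `dominates`, `pareto_front`, the factor names, and the numbers are hypothetical and do not come from the project's results.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates.

    candidates: dict mapping factor name -> (abs_rank_ic, ic_ir, turnover_score).
    """
    return {
        name: metrics
        for name, metrics in candidates.items()
        if not any(
            dominates(other, metrics)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }

# Made-up scores for three hypothetical factors.
candidates = {
    "factor_a": (0.05, 0.5, 0.60),  # strongest signal, noisier and costlier
    "factor_b": (0.03, 0.9, 0.95),  # weaker signal, stable and cheap
    "factor_c": (0.02, 0.4, 0.55),  # worse than factor_a on every metric
}

front = pareto_front(candidates)
print(sorted(front))  # ['factor_a', 'factor_b']
```

Neither `factor_a` nor `factor_b` beats the other on all three metrics, so both stay on the frontier; `factor_c` is dropped because `factor_a` matches or exceeds it everywhere.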