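🔍 Why filter data?
Quantitative trading often requires you to:
- 📈 Find stocks that gained more than 5%
- 💰 Find stocks priced below 10 yuan
- 📊 Pull the most recent 30 days of data
Pandas handles each of these in a line or two!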
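Here is a rough sketch of those three cases. It assumes a hypothetical daily table with columns `date`, `code`, `price`, and `pct_change`; the names and numbers are made up for illustration and are not part of this lesson's dataset.

```python
import pandas as pd

# Hypothetical sample data; column names and values are illustrative only.
daily = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02", "2024-02-20", "2024-03-01", "2024-03-15"]),
    "code": ["600519", "000858", "600585", "000651"],
    "price": [1800.0, 9.5, 30.0, 8.2],
    "pct_change": [6.2, -0.8, 5.4, 1.1],
})

# Stocks that gained more than 5%
big_gainers = daily[daily["pct_change"] > 5]

# Stocks priced below 10 yuan
cheap = daily[daily["price"] < 10]

# Rows from the 30 days leading up to the latest date in the table
cutoff = daily["date"].max() - pd.Timedelta(days=30)
recent = daily[daily["date"] >= cutoff]

print(big_gainers, cheap, recent, sep="\n\n")
```

The date filter is measured from the latest date in the table rather than from today, so the result stays the same whenever the sketch is run.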
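👨‍💻 Preparing the data
import pandas as pd
# Columns: 股票 = stock name, 代码 = ticker code, 价格 = price (yuan),
# 涨跌幅 = price change (%), 市值 = market-cap tier (大盘 = large-cap, 中盘 = mid-cap)
df = pd.DataFrame({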
"股票": ["茅台", "五粮液", "海螺", "平安", "格力"],
"代码": ["600519", "000858", "600585", "601318", "000651"],
"价格": [1800, 200, 30, 50, 35],
"涨跌幅": [1.5, -0.8, 2.1, 0.5, -1.2],
"市值": ["大盘", "中盘", "中盘", "大盘", "中盘"]
})
print(df)
🎯 Filtering on a single condition
# Stocks that rose
up = df[df["涨跌幅"] > 0]
print(up)
# Stocks that fell
down = df[df["涨跌幅"] < 0]
print(down)
# Price above 100
expensive = df[df["价格"] > 100]
print(expensive)
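It can help to see what sits inside the brackets. The comparison produces a boolean Series (a "mask") with one True/False per row, and `df[mask]` keeps the True rows. A minimal sketch, using a cut-down copy of the table so it runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    "股票": ["茅台", "五粮液", "海螺", "平安", "格力"],
    "涨跌幅": [1.5, -0.8, 2.1, 0.5, -1.2],
})

mask = df["涨跌幅"] > 0   # boolean Series, one True/False per row
print(mask.tolist())
print(mask.sum())         # True counts as 1, so this is the number of rising stocks

up = df[mask]             # same result as df[df["涨跌幅"] > 0]
print(up)
```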
🔗 Filtering on multiple conditions
# AND - use &, with each condition in its own parentheses
big_up = df[(df["涨跌幅"] > 1) & (df["价格"] > 100)]
print(big_up)
# OR - use | (the price cutoff of 50 is an assumed value for illustration)
cheap_or_up = df[(df["价格"] < 50) | (df["涨跌幅"] > 1)]
print(cheap_or_up)
# Exclude a specific value (here, rows where 涨跌幅 equals -0.8)
not_down = df[df["涨跌幅"] != -0.8]
print(not_down)
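Two details are easy to trip over here. Python's `&` and `|` bind more tightly than comparisons such as `>`, so each condition needs its own parentheses, and the keywords `and` / `or` do not work on a Series. Negation uses `~`. A small sketch with named masks, again on a cut-down copy of the table:

```python
import pandas as pd

df = pd.DataFrame({
    "股票": ["茅台", "五粮液", "海螺", "平安", "格力"],
    "价格": [1800, 200, 30, 50, 35],
    "涨跌幅": [1.5, -0.8, 2.1, 0.5, -1.2],
})

# Storing each condition as a named mask keeps long filters readable
is_up = df["涨跌幅"] > 0
is_expensive = df["价格"] > 100

up_and_expensive = df[is_up & is_expensive]   # both conditions hold
up_or_expensive = df[is_up | is_expensive]    # at least one condition holds
neither = df[~(is_up | is_expensive)]         # NOT: neither condition holds

print(up_and_expensive)
print(up_or_expensive)
print(neither)
```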
📋 Matching against a list
# Stocks whose 市值 (market-cap tier) is "大盘" (large-cap)
big_market = df[df["市值"].isin(["大盘"])]
print(big_market)
# Stocks whose code is one of these
codes = ["600519", "601318"]
selected = df[df["代码"].isin(codes)]
print(selected)
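The same method also works in reverse: putting `~` in front of `isin` keeps everything that is not in the list. The sketch below uses a hypothetical exclusion list, and also shows that `isin` compares exact values, which is why the codes are stored as strings.

```python
import pandas as pd

df = pd.DataFrame({
    "股票": ["茅台", "五粮液", "海螺", "平安", "格力"],
    "代码": ["600519", "000858", "600585", "601318", "000651"],
})

blacklist = ["000858", "000651"]   # hypothetical exclusion list
allowed = df[~df["代码"].isin(blacklist)]
print(allowed)

# The integer 600519 is not the string "600519", so nothing matches here
matches_int = df["代码"].isin([600519]).any()
print(matches_int)
```

Keeping codes as strings also preserves leading zeros such as the ones in "000858".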
🔎 Filtering on strings
# Codes that start with 6
start_with_6 = df[df["代码"].str.startswith("6")]
print(start_with_6)
# Stock names containing certain characters (regex alternation: 茅 or 格)
contains_m = df[df["股票"].str.contains("茅|格", regex=True)]
print(contains_m)
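Real data often has missing names. In that case the mask from `str.contains` can itself contain a missing value, and boolean indexing may reject it. One way to guard against that, sketched on a hypothetical table with one missing name, is to pass `na=False`:

```python
import pandas as pd

names = pd.DataFrame({
    "股票": ["茅台", None, "格力"],   # None stands in for a missing name
    "代码": ["600519", "000858", "000651"],
})

# na=False treats a missing name as "no match"
# regex=False matches the text literally instead of as a regular expression
has_mao = names[names["股票"].str.contains("茅", na=False, regex=False)]
print(has_mao)
```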
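💡 Quant in practice: stock-screening conditions
import pandas as pd
# Suppose this is the data for all stocks
# (代码 = ticker code, 价格 = price, 涨跌幅 = price change in %, 成交量 = trading volume)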
stocks = pd.DataFrame({
"代码": ["600519", "000858", "600585"],
"价格": [1800, 200, 30],
"涨跌幅": [1.5, -0.8, 2.1],
"成交量": [100000, 50000, 80000]
})
# Screening conditions:
# 1. Price change greater than 0
# 2. Volume greater than 50000
selected = stocks[(stocks["涨跌幅"] > 0) & (stocks["成交量"] > 50000)]
print("Stocks matching the conditions:")
print(selected)
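Once a screen works, it is handy to wrap it in a function so the thresholds can be changed without rewriting the filter. The helper `screen` below is a hypothetical name, and sorting the result by price change is an added convenience rather than part of the conditions above.

```python
import pandas as pd

def screen(stocks: pd.DataFrame, min_change: float = 0.0, min_volume: int = 50000) -> pd.DataFrame:
    """Keep rows with 涨跌幅 > min_change and 成交量 > min_volume, biggest gainers first."""
    mask = (stocks["涨跌幅"] > min_change) & (stocks["成交量"] > min_volume)
    return stocks[mask].sort_values("涨跌幅", ascending=False)

stocks = pd.DataFrame({
    "代码": ["600519", "000858", "600585"],
    "价格": [1800, 200, 30],
    "涨跌幅": [1.5, -0.8, 2.1],
    "成交量": [100000, 50000, 80000],
})

picked = screen(stocks)                   # default thresholds match the conditions above
strict = screen(stocks, min_change=2.0)   # a tighter, illustrative threshold
print(picked)
print(strict)
```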
📚 Next lesson
Now that you can filter data, the next lesson covers visualization: turning data into charts!