{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# yf_dianping Overview\n",
"0. **Download:** [Baidu Netdisk](https://pan.baidu.com/s/1yMNvHLl6QYsGbjT7u51Nfg)\n",
"1. **Data overview:** about 240,000 restaurants, 540,000 users, and 4.4 million reviews/ratings\n",
"2. **Suggested experiments:** recommender systems; sentiment, opinion, and review polarity analysis\n",
"3. **Data source:** [Dianping](http://www.dianping.com/)\n",
"4. **Original dataset:** [Dianping Review Dataset](http://yongfeng.me/dataset/), collected by Prof. Yongfeng Zhang for his WWW 2013, SIGIR 2013, and SIGIR 2014 conference papers\n",
"5. **Processing:**\n",
"    1. Kept only the reviews, ratings, and related fields from the original dataset and dropped the rest\n",
"    2. Reorganized into a format compatible with [MovieLens](https://grouplens.org/datasets/movielens/)\n",
"    3. Anonymized the data to protect user privacy"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [],
"source": [
"# Placeholder: replace with the path of your yf_dianping folder.\n",
"# Keep the trailing slash, because the cells below build file names with `path + '...'`.\n",
"path = 'path/to/yf_dianping/'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 1. restaurants.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the data"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Restaurants with a name: 209132\n",
"Restaurants without a name: 34115\n",
"Restaurants in total: 243247\n"
]
}
],
"source": [
"restaurants = pd.read_csv(path + 'restaurants.csv')\n",
"\n",
"# Missing names are read as NaN, so count named and unnamed restaurants separately\n",
"print('Restaurants with a name: %d' % restaurants['name'].notna().sum())\n",
"print('Restaurants without a name: %d' % restaurants['name'].isna().sum())\n",
"print('Restaurants in total: %d' % len(restaurants))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Field descriptions\n",
"\n",
"| Field | Description |\n",
"| ---- | ---- |\n",
"| restId | Restaurant id (numbered consecutively from 0) |\n",
"| name | Restaurant name (missing for some restaurants) |"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>restId</th>\n",
" <th>name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>210902</th>\n",
" <td>210902</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>124832</th>\n",
" <td>124832</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26766</th>\n",
" <td>26766</td>\n",
" <td>香锅制造(新苏天地店)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>91754</th>\n",
" <td>91754</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>204465</th>\n",
" <td>204465</td>\n",
" <td>西部牛扒城(湖塘店)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36475</th>\n",
" <td>36475</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>231861</th>\n",
" <td>231861</td>\n",
" <td>四季火锅</td>\n",
" </tr>\n",
" <tr>\n",
" <th>79816</th>\n",
" <td>79816</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>140694</th>\n",
" <td>140694</td>\n",
" <td>彝家牛汤锅</td>\n",
" </tr>\n",
" <tr>\n",
" <th>169641</th>\n",
" <td>169641</td>\n",
" <td>春秋</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33809</th>\n",
" <td>33809</td>\n",
" <td>九头鸟酒家(永定门店)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>236919</th>\n",
" <td>236919</td>\n",
" <td>老上海城隍庙小吃(人民大学店)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>182387</th>\n",
" <td>182387</td>\n",
" <td>河源三家村酒楼</td>\n",
" </tr>\n",
" <tr>\n",
" <th>140475</th>\n",
" <td>140475</td>\n",
" <td>荣记麻辣烫</td>\n",
" </tr>\n",
" <tr>\n",
" <th>194224</th>\n",
" <td>194224</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>152406</th>\n",
" <td>152406</td>\n",
" <td>鼎丰真(东四马路店)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11701</th>\n",
" <td>11701</td>\n",
" <td>南亚餐厅</td>\n",
" </tr>\n",
" <tr>\n",
" <th>58805</th>\n",
" <td>58805</td>\n",
" <td>益丰坊(虎泉店)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15641</th>\n",
" <td>15641</td>\n",
" <td>万达艾美酒店大堂吧</td>\n",
" </tr>\n",
" <tr>\n",
" <th>43424</th>\n",
" <td>43424</td>\n",
" <td>新美心绿姿生活</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" restId name\n",
"210902 210902 NaN\n",
"124832 124832 NaN\n",
"26766 26766 香锅制造(新苏天地店)\n",
"91754 91754 NaN\n",
"204465 204465 西部牛扒城(湖塘店)\n",
"36475 36475 NaN\n",
"231861 231861 四季火锅\n",
"79816 79816 NaN\n",
"140694 140694 彝家牛汤锅\n",
"169641 169641 春秋\n",
"33809 33809 九头鸟酒家(永定门店)\n",
"236919 236919 老上海城隍庙小吃(人民大学店)\n",
"182387 182387 河源三家村酒楼\n",
"140475 140475 荣记麻辣烫\n",
"194224 194224 NaN\n",
"152406 152406 鼎丰真(东四马路店)\n",
"11701 11701 南亚餐厅\n",
"58805 58805 益丰坊(虎泉店)\n",
"15641 15641 万达艾美酒店大堂吧\n",
"43424 43424 新美心绿姿生活"
]
},
"execution_count": 82,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"restaurants.sample(20)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. ratings.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the data"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Users: 542706\n",
"Ratings/reviews (total): 4422473\n",
"\n",
"Overall ratings in [1, 5]: 3293878\n",
"Environment ratings in [1, 5]: 4076220\n",
"Flavor ratings in [1, 5]: 4093819\n",
"Service ratings in [1, 5]: 4076220\n",
"Reviews: 4107409\n"
]
}
],
"source": [
"pd_ratings = pd.read_csv(path + 'ratings.csv')\n",
"\n",
"print('Users: %d' % pd_ratings['userId'].unique().shape[0])\n",
"print('Ratings/reviews (total): %d\\n' % pd_ratings.shape[0])\n",
"\n",
"# Count only ratings inside [1, 5]; between() is inclusive and treats NaN as False\n",
"print('Overall ratings in [1, 5]: %d' % pd_ratings['rating'].between(1, 5).sum())\n",
"print('Environment ratings in [1, 5]: %d' % pd_ratings['rating_env'].between(1, 5).sum())\n",
"print('Flavor ratings in [1, 5]: %d' % pd_ratings['rating_flavor'].between(1, 5).sum())\n",
"print('Service ratings in [1, 5]: %d' % pd_ratings['rating_service'].between(1, 5).sum())\n",
"# Rows with a non-missing review text\n",
"print('Reviews: %d' % pd_ratings['comment'].notna().sum())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Field descriptions\n",
"\n",
"| Field | Description |\n",
"| ---- | ---- |\n",
"| userId | User id (numbered consecutively from 0) |\n",
"| restId | Same restId as in restaurants.csv |\n",
"| rating | Overall rating, an integer in [0, 5] |\n",
"| rating_env | Environment rating, an integer in [1, 5] |\n",
"| rating_flavor | Flavor rating, an integer in [1, 5] |\n",
"| rating_service | Service rating, an integer in [1, 5] |\n",
"| timestamp | Rating timestamp |\n",
"| comment | Review text |"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>userId</th>\n",
" <th>restId</th>\n",
" <th>rating</th>\n",
" <th>rating_env</th>\n",
" <th>rating_flavor</th>\n",
" <th>rating_service</th>\n",
" <th>timestamp</th>\n",
" <th>comment</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>3331708</th>\n",
" <td>6802</td>\n",
" <td>183728</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>4.0</td>\n",
" <td>3.0</td>\n",
" <td>1315673880000</td>\n",
" <td>环境不错,停车方便,交通也比较方便,东西齐全,应有尽有,吃、喝、玩、乐样样齐全,还有个五星级...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3332473</th>\n",
" <td>3106</td>\n",
" <td>183750</td>\n",
" <td>5.0</td>\n",
" <td>4.0</td>\n",
" <td>4.0</td>\n",
" <td>4.0</td>\n",
" <td>1260155880000</td>\n",
" <td>去过两次,都是由日本朋友带着去的,很喜欢那种在小巷子深处的店,总觉得那样的店料理会很好吃。最...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>291609</th>\n",
" <td>39590</td>\n",
" <td>13570</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>2.0</td>\n",
" <td>3.0</td>\n",
" <td>1324792500000</td>\n",
" <td>朋友请客,两个人中午去吃的,虽然不是节假日,但人还是非常的多,等了很长时间才上餐,价位偏高,...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>749582</th>\n",
" <td>59192</td>\n",
" <td>38519</td>\n",
" <td>4.0</td>\n",
" <td>2.0</td>\n",
" <td>3.0</td>\n",
" <td>2.0</td>\n",
" <td>1321430760000</td>\n",
" <td>十一长假之前,我们的房子终于有了好消息,这个月底就可以拿到钥匙,真是不容易,盼星星盼月亮的,...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>719908</th>\n",
" <td>241643</td>\n",
" <td>36382</td>\n",
" <td>1.0</td>\n",
" <td>2.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>1271862180000</td>\n",
" <td>很差的一家店!公司聚餐居然选在这里,真是个大大的失策!\\n点的菜迟迟不上,不知道是故意不上还...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3127953</th>\n",
" <td>12481</td>\n",
" <td>173459</td>\n",
" <td>4.0</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>1300407540000</td>\n",
" <td>这家是离家最近的一家城市超市了,所以自然要进去随便逛逛啦。\\n因为附近是居民区,自然光顾的主...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2068253</th>\n",
" <td>13070</td>\n",
" <td>115853</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>2.0</td>\n",
" <td>1308671820000</td>\n",
" <td>以前觉得还行,但有了85度之后就不行了。要了个提拉米苏,不行,太甜了。\\n辣松的味道倒不错,...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>640356</th>\n",
" <td>168006</td>\n",
" <td>33263</td>\n",
" <td>NaN</td>\n",
" <td>3.0</td>\n",
" <td>5.0</td>\n",
" <td>3.0</td>\n",
" <td>1224868560000</td>\n",
" <td>算比较地道的川菜了 味道辣的很正 强力推荐 据说还是标点美食的... 香辣鸡翅每去必点~!不...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1222261</th>\n",
" <td>76280</td>\n",
" <td>65171</td>\n",
" <td>3.0</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>1302136740000</td>\n",
" <td>为什么这么多人说好吃啊?为什么这么多人说肉多啊?难道是我人品有问题?\\n这个也是慕名而去的~...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>101366</th>\n",
" <td>67372</td>\n",
" <td>2853</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>1283741400000</td>\n",
" <td>两年前经常去这家吃卤煮,感觉特别好吃,可是最近吃了一次,让我大失所望。。。\\n卤煮的汤和食材...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" userId restId rating rating_env rating_flavor rating_service \\\n",
"3331708 6802 183728 3.0 3.0 4.0 3.0 \n",
"3332473 3106 183750 5.0 4.0 4.0 4.0 \n",
"291609 39590 13570 3.0 3.0 2.0 3.0 \n",
"749582 59192 38519 4.0 2.0 3.0 2.0 \n",
"719908 241643 36382 1.0 2.0 1.0 1.0 \n",
"3127953 12481 173459 4.0 3.0 3.0 3.0 \n",
"2068253 13070 115853 3.0 3.0 3.0 2.0 \n",
"640356 168006 33263 NaN 3.0 5.0 3.0 \n",
"1222261 76280 65171 3.0 2.0 2.0 2.0 \n",
"101366 67372 2853 1.0 1.0 1.0 1.0 \n",
"\n",
" timestamp comment \n",
"3331708 1315673880000 环境不错,停车方便,交通也比较方便,东西齐全,应有尽有,吃、喝、玩、乐样样齐全,还有个五星级... \n",
"3332473 1260155880000 去过两次,都是由日本朋友带着去的,很喜欢那种在小巷子深处的店,总觉得那样的店料理会很好吃。最... \n",
"291609 1324792500000 朋友请客,两个人中午去吃的,虽然不是节假日,但人还是非常的多,等了很长时间才上餐,价位偏高,... \n",
"749582 1321430760000 十一长假之前,我们的房子终于有了好消息,这个月底就可以拿到钥匙,真是不容易,盼星星盼月亮的,... \n",
"719908 1271862180000 很差的一家店!公司聚餐居然选在这里,真是个大大的失策!\\n点的菜迟迟不上,不知道是故意不上还... \n",
"3127953 1300407540000 这家是离家最近的一家城市超市了,所以自然要进去随便逛逛啦。\\n因为附近是居民区,自然光顾的主... \n",
"2068253 1308671820000 以前觉得还行,但有了85度之后就不行了。要了个提拉米苏,不行,太甜了。\\n辣松的味道倒不错,... \n",
"640356 1224868560000 算比较地道的川菜了 味道辣的很正 强力推荐 据说还是标点美食的... 香辣鸡翅每去必点~!不... \n",
"1222261 1302136740000 为什么这么多人说好吃啊?为什么这么多人说肉多啊?难道是我人品有问题?\\n这个也是慕名而去的~... \n",
"101366 1283741400000 两年前经常去这家吃卤煮,感觉特别好吃,可是最近吃了一次,让我大失所望。。。\\n卤煮的汤和食材... "
]
},
"execution_count": 84,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd_ratings.sample(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 3. links.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the data"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [],
"source": [
"links = pd.read_csv(path + 'links.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Field descriptions\n",
"\n",
"| Field | Description |\n",
"| ---- | ---- |\n",
"| restId | Same restId as in restaurants.csv and ratings.csv |\n",
"| dianpingId | Restaurant id on the Dianping website |"
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>restId</th>\n",
" <th>dianpingId</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>138492</th>\n",
" <td>138492</td>\n",
" <td>3566359</td>\n",
" </tr>\n",
" <tr>\n",
" <th>158007</th>\n",
" <td>158007</td>\n",
" <td>2484433</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16170</th>\n",
" <td>16170</td>\n",
" <td>3651451</td>\n",
" </tr>\n",
" <tr>\n",
" <th>116637</th>\n",
" <td>116637</td>\n",
" <td>5143029</td>\n",
" </tr>\n",
" <tr>\n",
" <th>191554</th>\n",
" <td>191554</td>\n",
" <td>2734621</td>\n",
" </tr>\n",
" <tr>\n",
" <th>192481</th>\n",
" <td>192481</td>\n",
" <td>3000367</td>\n",
" </tr>\n",
" <tr>\n",
" <th>40978</th>\n",
" <td>40978</td>\n",
" <td>3168181</td>\n",
" </tr>\n",
" <tr>\n",
" <th>196832</th>\n",
" <td>196832</td>\n",
" <td>3523291</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6048</th>\n",
" <td>6048</td>\n",
" <td>2435827</td>\n",
" </tr>\n",
" <tr>\n",
" <th>200405</th>\n",
" <td>200405</td>\n",
" <td>4130573</td>\n",
" </tr>\n",
" <tr>\n",
" <th>69792</th>\n",
" <td>69792</td>\n",
" <td>2853502</td>\n",
" </tr>\n",
" <tr>\n",
" <th>153075</th>\n",
" <td>153075</td>\n",
" <td>2000257</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8528</th>\n",
" <td>8528</td>\n",
" <td>2651221</td>\n",
" </tr>\n",
" <tr>\n",
" <th>196930</th>\n",
" <td>196930</td>\n",
" <td>3534673</td>\n",
" </tr>\n",
" <tr>\n",
" <th>224063</th>\n",
" <td>224063</td>\n",
" <td>3138160</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3434</th>\n",
" <td>3434</td>\n",
" <td>2185753</td>\n",
" </tr>\n",
" <tr>\n",
" <th>125490</th>\n",
" <td>125490</td>\n",
" <td>2112511</td>\n",
" </tr>\n",
" <tr>\n",
" <th>230533</th>\n",
" <td>230533</td>\n",
" <td>4122445</td>\n",
" </tr>\n",
" <tr>\n",
" <th>130597</th>\n",
" <td>130597</td>\n",
" <td>2632129</td>\n",
" </tr>\n",
" <tr>\n",
" <th>186956</th>\n",
" <td>186956</td>\n",
" <td>2233513</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" restId dianpingId\n",
"138492 138492 3566359\n",
"158007 158007 2484433\n",
"16170 16170 3651451\n",
"116637 116637 5143029\n",
"191554 191554 2734621\n",
"192481 192481 3000367\n",
"40978 40978 3168181\n",
"196832 196832 3523291\n",
"6048 6048 2435827\n",
"200405 200405 4130573\n",
"69792 69792 2853502\n",
"153075 153075 2000257\n",
"8528 8528 2651221\n",
"196930 196930 3534673\n",
"224063 224063 3138160\n",
"3434 3434 2185753\n",
"125490 125490 2112511\n",
"230533 230533 4122445\n",
"130597 130597 2632129\n",
"186956 186956 2233513"
]
},
"execution_count": 86,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"links.sample(20)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
},
"widgets": {
"state": {},
"version": "1.1.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}