805 lines
24 KiB
Plaintext
805 lines
24 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# yf_amazon 说明\n",
|
||
"0. **下载地址:** [百度网盘](https://pan.baidu.com/s/1SbfpZb5cm-g2LmnYV_af8Q)\n",
|
||
"1. **数据概览:** 52 万件商品,1100 多个类目,142 万用户,720 万条评论/评分数据\n",
|
||
"2. **推荐实验:** 推荐系统、情感/观点/评论 倾向性分析\n",
|
||
"2. **数据来源:** [亚马逊](https://www.amazon.cn/)\n",
|
||
"3. **原数据集:** [JD.com E-Commerce Data](http://yongfeng.me/dataset/),Yongfeng Zhang 教授为 WWW 2015 会议论文而搜集的数据\n",
|
||
"4. **加工处理:**\n",
|
||
" 1. 将全角字符转换为半角字符,并采用 UTF-8 编码\n",
|
||
" 2. 整理成与 [MovieLens](https://grouplens.org/datasets/movielens/) 兼容的格式\n",
|
||
" 3. 进行脱敏操作,以保护用户隐私"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import pandas as pd"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"path = 'yf_amazon_文件夹_所在_路径'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 1. products.csv"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 加载数据"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"产品数目:525619\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"products = pd.read_csv(path + 'products.csv')\n",
|
||
"\n",
|
||
"print('产品数目:%d' % products.shape[0])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 字段说明\n",
|
||
"\n",
|
||
"| 字段 | 说明 |\n",
|
||
"| ---- | ---- |\n",
|
||
"| productId | 产品 id (从 0 开始,连续编号) |\n",
|
||
"| name | 产品名称 |\n",
|
||
"| catIds | 类别 id(从 0 开始,连续编号,从左到右依次表示一级类目、二级类目、三级类目) |"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>productId</th>\n",
|
||
" <th>name</th>\n",
|
||
" <th>catIds</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>331420</th>\n",
|
||
" <td>331420</td>\n",
|
||
" <td>欧意金狐狸 女式 皮手套 QT602</td>\n",
|
||
" <td>802,143,996</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>130945</th>\n",
|
||
" <td>130945</td>\n",
|
||
" <td>YESO TOT 中性 单肩包/斜挎包 均码 9411</td>\n",
|
||
" <td>1111,864,781</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>179886</th>\n",
|
||
" <td>179886</td>\n",
|
||
" <td>李斯特论柏辽兹与舒曼</td>\n",
|
||
" <td>832,552,337</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>504123</th>\n",
|
||
" <td>504123</td>\n",
|
||
" <td>Tuscarora 途斯卡洛拉 中性 烈焰驰骋无缝头巾 PSU3083</td>\n",
|
||
" <td>1111,522,720</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>387785</th>\n",
|
||
" <td>387785</td>\n",
|
||
" <td>我们的故事:一百个北大荒老知青的人生形态</td>\n",
|
||
" <td>832,519,599</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>406231</th>\n",
|
||
" <td>406231</td>\n",
|
||
" <td>图读周易</td>\n",
|
||
" <td>832,723,724</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>199072</th>\n",
|
||
" <td>199072</td>\n",
|
||
" <td>Barbie 芭比 女童 运动休闲鞋 A22993</td>\n",
|
||
" <td>802,777,601</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>518528</th>\n",
|
||
" <td>518528</td>\n",
|
||
" <td>HiVi 惠威 多媒体音箱 D1080MKII 2.0声道 棕色</td>\n",
|
||
" <td>1057,439,1064</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>446621</th>\n",
|
||
" <td>446621</td>\n",
|
||
" <td>HALTI 男式 JUOVAJACKET 芬兰国家队系列 羽绒滑雪服 H0591922</td>\n",
|
||
" <td>1111,651,693</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>379960</th>\n",
|
||
" <td>379960</td>\n",
|
||
" <td>塑料回收再生术:百工百技</td>\n",
|
||
" <td>832,1096,509</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" productId name catIds\n",
|
||
"331420 331420 欧意金狐狸 女式 皮手套 QT602 802,143,996\n",
|
||
"130945 130945 YESO TOT 中性 单肩包/斜挎包 均码 9411 1111,864,781\n",
|
||
"179886 179886 李斯特论柏辽兹与舒曼 832,552,337\n",
|
||
"504123 504123 Tuscarora 途斯卡洛拉 中性 烈焰驰骋无缝头巾 PSU3083 1111,522,720\n",
|
||
"387785 387785 我们的故事:一百个北大荒老知青的人生形态 832,519,599\n",
|
||
"406231 406231 图读周易 832,723,724\n",
|
||
"199072 199072 Barbie 芭比 女童 运动休闲鞋 A22993 802,777,601\n",
|
||
"518528 518528 HiVi 惠威 多媒体音箱 D1080MKII 2.0声道 棕色 1057,439,1064\n",
|
||
"446621 446621 HALTI 男式 JUOVAJACKET 芬兰国家队系列 羽绒滑雪服 H0591922 1111,651,693\n",
|
||
"379960 379960 塑料回收再生术:百工百技 832,1096,509"
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"products.sample(10)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 2. categories.csv"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 加载数据"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"类别数目:1175\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"categories = pd.read_csv(path + 'categories.csv')\n",
|
||
"\n",
|
||
"print('类别数目:%d' % categories.shape[0])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 字段说明\n",
|
||
"\n",
|
||
"| 字段 | 说明 |\n",
|
||
"| ---- | ---- |\n",
|
||
"| catId | 类别 id (从 0 开始,连续编号) |\n",
|
||
"| category | 类别名称 |"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>catId</th>\n",
|
||
" <th>category</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>947</th>\n",
|
||
" <td>947</td>\n",
|
||
" <td>理发器</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>818</th>\n",
|
||
" <td>818</td>\n",
|
||
" <td>电脑硬件</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>212</th>\n",
|
||
" <td>212</td>\n",
|
||
" <td>帐篷</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>815</th>\n",
|
||
" <td>815</td>\n",
|
||
" <td>路由器/中继器</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>829</th>\n",
|
||
" <td>829</td>\n",
|
||
" <td>拉杆箱/包</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>391</th>\n",
|
||
" <td>391</td>\n",
|
||
" <td>女鞋</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>756</th>\n",
|
||
" <td>756</td>\n",
|
||
" <td>大型健身器械</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>11</td>\n",
|
||
" <td>其他运动器材</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>633</th>\n",
|
||
" <td>633</td>\n",
|
||
" <td>垂钓用品</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>115</th>\n",
|
||
" <td>115</td>\n",
|
||
" <td>卡通</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" catId category\n",
|
||
"947 947 理发器\n",
|
||
"818 818 电脑硬件\n",
|
||
"212 212 帐篷\n",
|
||
"815 815 路由器/中继器\n",
|
||
"829 829 拉杆箱/包\n",
|
||
"391 391 女鞋\n",
|
||
"756 756 大型健身器械\n",
|
||
"11 11 其他运动器材\n",
|
||
"633 633 垂钓用品\n",
|
||
"115 115 卡通"
|
||
]
|
||
},
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"categories.sample(10)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 3. ratings.csv"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 加载数据"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"用户 数目:1424596\n",
|
||
"评分/评论 数目(总计):7202921\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"pd_ratings = pd.read_csv(path+'ratings.csv')\n",
|
||
"\n",
|
||
"print('用户 数目:%d' % pd_ratings.userId.unique().shape[0])\n",
|
||
"print('评分/评论 数目(总计):%d\\n' % pd_ratings.shape[0])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 字段说明\n",
|
||
"\n",
|
||
"| 字段 | 说明 |\n",
|
||
"| ---- | ---- |\n",
|
||
"| userId | 用户 id (从 0 开始,连续编号) |\n",
|
||
"| productId | 即 products.csv 中的 productId |\n",
|
||
"| rating | 评分,[1,5] 之间的整数 |\n",
|
||
"| timestamp | 评分时间戳 |\n",
|
||
"| title | 评论的标题 |\n",
|
||
"| comment | 评论的内容 |"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {
|
||
"scrolled": false
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>userId</th>\n",
|
||
" <th>productId</th>\n",
|
||
" <th>rating</th>\n",
|
||
" <th>timestamp</th>\n",
|
||
" <th>title</th>\n",
|
||
" <th>comment</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>4287636</th>\n",
|
||
" <td>230944.0</td>\n",
|
||
" <td>394505</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>1393084800</td>\n",
|
||
" <td>赞!</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3940838</th>\n",
|
||
" <td>16628.0</td>\n",
|
||
" <td>84789</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>1389715200</td>\n",
|
||
" <td>喜欢</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4064284</th>\n",
|
||
" <td>325829.0</td>\n",
|
||
" <td>94108</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>1384531200</td>\n",
|
||
" <td>磨脚</td>\n",
|
||
" <td>右脚小脚趾磨掉一块皮</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4802616</th>\n",
|
||
" <td>586385.0</td>\n",
|
||
" <td>254002</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>1383408000</td>\n",
|
||
" <td>哦~</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>292946</th>\n",
|
||
" <td>842028.0</td>\n",
|
||
" <td>231449</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>1369324800</td>\n",
|
||
" <td>致我们终将逝去的青春</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2306551</th>\n",
|
||
" <td>933226.0</td>\n",
|
||
" <td>219015</td>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>1341763200</td>\n",
|
||
" <td>有点大 不过很漂亮</td>\n",
|
||
" <td>外观很精致的说 就是外形有点偏大</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1707442</th>\n",
|
||
" <td>402851.0</td>\n",
|
||
" <td>228321</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>1374076800</td>\n",
|
||
" <td>给宝宝讲讲挺好的,内容简单,便于宝宝理解。</td>\n",
|
||
" <td>给宝宝讲讲挺好的,内容简单,便于宝宝理解。</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3641724</th>\n",
|
||
" <td>123473.0</td>\n",
|
||
" <td>515623</td>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>1305475200</td>\n",
|
||
" <td>书很好,但居然没有包装!?!?!?</td>\n",
|
||
" <td>书很好,但居然没有包装!?!?!?这么好的书却没有包装!?!?!?</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1921912</th>\n",
|
||
" <td>435946.0</td>\n",
|
||
" <td>63238</td>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>1357228800</td>\n",
|
||
" <td>嗯</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1475151</th>\n",
|
||
" <td>1612.0</td>\n",
|
||
" <td>139044</td>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>1316102400</td>\n",
|
||
" <td>一般</td>\n",
|
||
" <td>香味没有前面评价那么香,就是普通的爽肤水,有点黏黏的</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" userId productId rating timestamp title \\\n",
|
||
"4287636 230944.0 394505 5.0 1393084800 赞! \n",
|
||
"3940838 16628.0 84789 5.0 1389715200 喜欢 \n",
|
||
"4064284 325829.0 94108 3.0 1384531200 磨脚 \n",
|
||
"4802616 586385.0 254002 5.0 1383408000 哦~ \n",
|
||
"292946 842028.0 231449 5.0 1369324800 致我们终将逝去的青春 \n",
|
||
"2306551 933226.0 219015 4.0 1341763200 有点大 不过很漂亮 \n",
|
||
"1707442 402851.0 228321 5.0 1374076800 给宝宝讲讲挺好的,内容简单,便于宝宝理解。 \n",
|
||
"3641724 123473.0 515623 4.0 1305475200 书很好,但居然没有包装!?!?!? \n",
|
||
"1921912 435946.0 63238 4.0 1357228800 嗯 \n",
|
||
"1475151 1612.0 139044 4.0 1316102400 一般 \n",
|
||
"\n",
|
||
" comment \n",
|
||
"4287636 NaN \n",
|
||
"3940838 NaN \n",
|
||
"4064284 右脚小脚趾磨掉一块皮 \n",
|
||
"4802616 NaN \n",
|
||
"292946 NaN \n",
|
||
"2306551 外观很精致的说 就是外形有点偏大 \n",
|
||
"1707442 给宝宝讲讲挺好的,内容简单,便于宝宝理解。 \n",
|
||
"3641724 书很好,但居然没有包装!?!?!?这么好的书却没有包装!?!?!? \n",
|
||
"1921912 NaN \n",
|
||
"1475151 香味没有前面评价那么香,就是普通的爽肤水,有点黏黏的 "
|
||
]
|
||
},
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"pd_ratings.sample(10)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# 4. links.csv"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 加载数据"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"links = pd.read_csv(path + 'links.csv')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## 字段说明\n",
|
||
"\n",
|
||
"| 字段 | 说明 |\n",
|
||
"| ---- | ---- |\n",
|
||
"| productId | 即 products.csv 和 ratings.csv 中的 productId |\n",
|
||
"| amazonId | 亚马逊的产品编号 |"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>productId</th>\n",
|
||
" <th>amazonId</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>436251</th>\n",
|
||
" <td>436251</td>\n",
|
||
" <td>B00F91KYGK</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>194578</th>\n",
|
||
" <td>194578</td>\n",
|
||
" <td>B00GICSVUK</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>336998</th>\n",
|
||
" <td>336998</td>\n",
|
||
" <td>B00GMKUNBI</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>371924</th>\n",
|
||
" <td>371924</td>\n",
|
||
" <td>B008RIA4AS</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>433617</th>\n",
|
||
" <td>433617</td>\n",
|
||
" <td>B00332FJ7Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>236918</th>\n",
|
||
" <td>236918</td>\n",
|
||
" <td>060614479X</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>388158</th>\n",
|
||
" <td>388158</td>\n",
|
||
" <td>B008TI5V2C</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>479855</th>\n",
|
||
" <td>479855</td>\n",
|
||
" <td>B002NSML6I</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>311842</th>\n",
|
||
" <td>311842</td>\n",
|
||
" <td>B001DTWV2C</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>445227</th>\n",
|
||
" <td>445227</td>\n",
|
||
" <td>B0055PT83U</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>360465</th>\n",
|
||
" <td>360465</td>\n",
|
||
" <td>B005UTT2QY</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>258363</th>\n",
|
||
" <td>258363</td>\n",
|
||
" <td>0805092919</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>308642</th>\n",
|
||
" <td>308642</td>\n",
|
||
" <td>B0079WMXT8</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>232740</th>\n",
|
||
" <td>232740</td>\n",
|
||
" <td>B0018HKRAW</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>335318</th>\n",
|
||
" <td>335318</td>\n",
|
||
" <td>B00840LWKU</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>497048</th>\n",
|
||
" <td>497048</td>\n",
|
||
" <td>B003ZI61RA</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>388969</th>\n",
|
||
" <td>388969</td>\n",
|
||
" <td>B00BIUYL06</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10448</th>\n",
|
||
" <td>10448</td>\n",
|
||
" <td>B00GMZ9DKK</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>75752</th>\n",
|
||
" <td>75752</td>\n",
|
||
" <td>B002R0DNB4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>392345</th>\n",
|
||
" <td>392345</td>\n",
|
||
" <td>B0041IY7CE</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" productId amazonId\n",
|
||
"436251 436251 B00F91KYGK\n",
|
||
"194578 194578 B00GICSVUK\n",
|
||
"336998 336998 B00GMKUNBI\n",
|
||
"371924 371924 B008RIA4AS\n",
|
||
"433617 433617 B00332FJ7Q\n",
|
||
"236918 236918 060614479X\n",
|
||
"388158 388158 B008TI5V2C\n",
|
||
"479855 479855 B002NSML6I\n",
|
||
"311842 311842 B001DTWV2C\n",
|
||
"445227 445227 B0055PT83U\n",
|
||
"360465 360465 B005UTT2QY\n",
|
||
"258363 258363 0805092919\n",
|
||
"308642 308642 B0079WMXT8\n",
|
||
"232740 232740 B0018HKRAW\n",
|
||
"335318 335318 B00840LWKU\n",
|
||
"497048 497048 B003ZI61RA\n",
|
||
"388969 388969 B00BIUYL06\n",
|
||
"10448 10448 B00GMZ9DKK\n",
|
||
"75752 75752 B002R0DNB4\n",
|
||
"392345 392345 B0041IY7CE"
|
||
]
|
||
},
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"links.sample(20)"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.6.5"
|
||
},
|
||
"widgets": {
|
||
"state": {},
|
||
"version": "1.1.2"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 2
|
||
}
|