Contents
- DeepFM Description
- Model Architecture
- Dataset
- Environment Requirements
- Quick Start
- Script Description
- Model Description
- Description of Random Situation
- ModelZoo Homepage
DeepFM Description
Learning sophisticated feature interactions behind user behaviors is critical for maximizing CTR in recommender systems. Despite great progress, existing methods tend to have a strong bias towards low- or high-order interactions, or to require expert feature engineering. The DeepFM paper shows that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture.
Paper: Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Model Architecture
DeepFM consists of two components. The FM component is a factorization machine, which learns low-order (pairwise) feature interactions for recommendation. The deep component is a feed-forward neural network, which learns high-order feature interactions. The FM and deep components share the same raw input feature vector, which enables DeepFM to learn low- and high-order feature interactions simultaneously from the raw input features. The final prediction combines both parts: y = sigmoid(y_FM + y_DNN).
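To make the combination concrete, below is a minimal NumPy sketch of that forward pass. It is illustrative only: it uses a dense input vector and feeds a pooled embedding to the deep part, whereas the actual network operates on sparse fields and concatenated per-field embeddings; all names here are hypothetical.

```python
import numpy as np

def deepfm_forward(x, w0, w, V, dnn_layers):
    """Illustrative DeepFM forward pass: y = sigmoid(y_FM + y_DNN).

    x: dense feature vector, shape (n,)
    w0, w: FM bias (scalar) and first-order weights, shape (n,)
    V: FM embedding matrix, shape (n, k), shared by both components
    dnn_layers: list of (W, b) pairs; the last pair maps to a scalar logit
    """
    # FM first-order (linear) term
    linear = w0 + x @ w
    # FM second-order term via the O(nk) identity:
    # sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(sum_i V_if x_i)^2 - sum_i (V_if x_i)^2]
    xv = x @ V                                    # pooled embedding, shape (k,)
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    y_fm = linear + pairwise
    # Deep component: a plain MLP with ReLU hidden layers over the pooled embedding
    h = xv
    for W, b in dnn_layers[:-1]:
        h = np.maximum(0.0, h @ W + b)
    W_out, b_out = dnn_layers[-1]
    y_dnn = h @ W_out + b_out                     # scalar logit
    return 1.0 / (1.0 + np.exp(-(y_fm + y_dnn)))  # predicted CTR in (0, 1)

# Toy usage with random parameters
rng = np.random.default_rng(0)
n, k, hidden = 10, 4, 8
layers = [(rng.normal(size=(k, hidden)), np.zeros(hidden)),
          (rng.normal(size=hidden), 0.0)]
print(deepfm_forward(rng.random(n), 0.0, rng.normal(size=n),
                     rng.normal(size=(n, k)), layers))
```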
Dataset
- [1] A dataset used in Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction[J]. 2017.
Environment Requirements
- Hardware (Ascend/GPU/CPU)
  - Prepare the hardware environment with an Ascend, GPU, or CPU processor.
- Framework
  - MindSpore
- For more information, please check the resources below:
  - MindSpore Tutorials
  - MindSpore Python API
Quick Start
After installing MindSpore via the official website, you can start training and evaluation as follows:
- running on Ascend

  ```shell
  # run training example
  python train.py \
    --dataset_path='dataset/train' \
    --ckpt_path='./checkpoint' \
    --eval_file_name='auc.log' \
    --loss_file_name='loss.log' \
    --device_target='Ascend' \
    --do_eval=True > ms_log/output.log 2>&1 &

  # run distributed training example
  sh scripts/run_distribute_train.sh 8 /dataset_path /rank_table_8p.json

  # run evaluation example
  python eval.py \
    --dataset_path='dataset/test' \
    --checkpoint_path='./checkpoint/deepfm.ckpt' \
    --device_target='Ascend' > ms_log/eval_output.log 2>&1 &
  OR
  sh scripts/run_eval.sh 0 Ascend /dataset_path /checkpoint_path/deepfm.ckpt
  ```
For distributed training, an HCCL configuration file (rank table) in JSON format needs to be created in advance. It can be generated, for example, with the hccl_tools script provided by MindSpore ModelZoo.
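For reference, a rank table for a single server typically looks like the excerpt below (shortened to two devices). This is an illustrative sketch, not a file from this repository; every value is a placeholder that must match your actual server ID, device IPs, and rank IDs.

```json
{
    "version": "1.0",
    "server_count": "1",
    "server_list": [
        {
            "server_id": "10.0.0.1",
            "device": [
                {"device_id": "0", "device_ip": "192.98.92.100", "rank_id": "0"},
                {"device_id": "1", "device_ip": "192.98.93.100", "rank_id": "1"}
            ]
        }
    ],
    "status": "completed"
}
```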
- running on GPU

  For running on GPU, please change `device_target` from `Ascend` to `GPU` in the configuration file src/config.py.

  ```shell
  # run training example
  python train.py \
    --dataset_path='dataset/train' \
    --ckpt_path='./checkpoint' \
    --eval_file_name='auc.log' \
    --loss_file_name='loss.log' \
    --device_target='GPU' \
    --do_eval=True > ms_log/output.log 2>&1 &

  # run distributed training example
  sh scripts/run_distribute_train.sh 8 /dataset_path

  # run evaluation example
  python eval.py \
    --dataset_path='dataset/test' \
    --checkpoint_path='./checkpoint/deepfm.ckpt' \
    --device_target='GPU' > ms_log/eval_output.log 2>&1 &
  OR
  sh scripts/run_eval.sh 0 GPU /dataset_path /checkpoint_path/deepfm.ckpt
  ```
- running on CPU

  ```shell
  # run training example
  python train.py \
    --dataset_path='dataset/train' \
    --ckpt_path='./checkpoint' \
    --eval_file_name='auc.log' \
    --loss_file_name='loss.log' \
    --device_target='CPU' \
    --do_eval=True > ms_log/output.log 2>&1 &

  # run evaluation example
  python eval.py \
    --dataset_path='dataset/test' \
    --checkpoint_path='./checkpoint/deepfm.ckpt' \
    --device_target='CPU' > ms_log/eval_output.log 2>&1 &
  ```
Script Description
Script and Sample Code
```text
.
└─deepfm
  ├─README.md
  ├─mindspore_hub_conf.py            # config for mindspore hub
  ├─scripts
    ├─run_standalone_train.sh        # launch standalone training(1p) in Ascend or GPU
    ├─run_distribute_train.sh        # launch distributed training(8p) in Ascend
    ├─run_distribute_train_gpu.sh    # launch distributed training(8p) in GPU
    └─run_eval.sh                    # launch evaluating in Ascend or GPU
  ├─src
    ├─__init__.py                    # python init file
    ├─config.py                      # parameter configuration
    ├─callback.py                    # define callback function
    ├─deepfm.py                      # deepfm network
    └─dataset.py                     # create dataset for deepfm
  ├─eval.py                          # eval net
  └─train.py                         # train net
```
Script Parameters
Parameters for both training and evaluation can be set in `src/config.py`.
- train parameters

  ```text
  optional arguments:
    -h, --help            show this help message and exit
    --dataset_path DATASET_PATH
                          Dataset path
    --ckpt_path CKPT_PATH
                          Checkpoint path
    --eval_file_name EVAL_FILE_NAME
                          Auc log file path. Default: "./auc.log"
    --loss_file_name LOSS_FILE_NAME
                          Loss log file path. Default: "./loss.log"
    --do_eval DO_EVAL     Do evaluation or not. Default: True
    --device_target DEVICE_TARGET
                          Ascend or GPU. Default: Ascend
  ```
- eval parameters

  ```text
  optional arguments:
    -h, --help            show this help message and exit
    --checkpoint_path CHECKPOINT_PATH
                          Checkpoint file path
    --dataset_path DATASET_PATH
                          Dataset path
    --device_target DEVICE_TARGET
                          Ascend or GPU. Default: Ascend
  ```
Training Process
Training
- running on Ascend

  ```shell
  python train.py \
    --dataset_path='dataset/train' \
    --ckpt_path='./checkpoint' \
    --eval_file_name='auc.log' \
    --loss_file_name='loss.log' \
    --device_target='Ascend' \
    --do_eval=True > ms_log/output.log 2>&1 &
  ```
The python command above will run in the background; you can view the results through the file `ms_log/output.log`. After training, you'll get some checkpoint files under the `./checkpoint` folder by default. Loss values are saved in the loss.log file (a small parsing sketch follows below):

```log
2020-05-27 15:26:29 epoch: 1 step: 41257, loss is 0.498953253030777
2020-05-27 15:32:32 epoch: 2 step: 41257, loss is 0.45545706152915955
...
```
The model checkpoint will be saved in the current directory.
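The loss lines above have a fixed shape, so they are easy to post-process. Below is a small, hypothetical helper (not part of this repository) that pulls (epoch, step, loss) tuples out of loss.log; the regex simply mirrors the sample lines shown above.

```python
import re

# Matches lines like:
# 2020-05-27 15:26:29 epoch: 1 step: 41257, loss is 0.498953253030777
LOSS_LINE = re.compile(r"epoch: (\d+) step: (\d+), loss is ([\d.]+)")

def parse_loss_log(path="loss.log"):
    """Return a list of (epoch, step, loss) tuples parsed from the log file."""
    records = []
    with open(path) as log_file:
        for line in log_file:
            match = LOSS_LINE.search(line)
            if match:
                epoch, step, loss = match.groups()
                records.append((int(epoch), int(step), float(loss)))
    return records
```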
- running on GPU

  To do.
Distributed Training
- running on Ascend

  ```shell
  sh scripts/run_distribute_train.sh 8 /dataset_path /rank_table_8p.json
  ```
The above shell script will run distributed training in the background. You can view the results through the file `log[X]/output.log`. Loss values are saved in the loss.log file.

- running on GPU

  To do.
Evaluation Process
Evaluation
- evaluation on dataset when running on Ascend

  Before running the command below, please check the checkpoint path used for evaluation.

  ```shell
  python eval.py \
    --dataset_path='dataset/test' \
    --checkpoint_path='./checkpoint/deepfm.ckpt' \
    --device_target='Ascend' > ms_log/eval_output.log 2>&1 &
  OR
  sh scripts/run_eval.sh 0 Ascend /dataset_path /checkpoint_path/deepfm.ckpt
  ```
The above python command will run in the background. You can view the results through the file "eval_output.log". The accuracy is saved in the auc.log file:

```log
{'result': {'AUC': 0.8057789065281104, 'eval_time': 35.64779996871948}}
```
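The reported metric is AUC (area under the ROC curve): the probability that a randomly chosen clicked example is ranked above a randomly chosen unclicked one. For reference, the same quantity can be computed with scikit-learn, assuming it is installed (it is not a dependency of this repository):

```python
from sklearn.metrics import roc_auc_score

# Toy example: 0/1 click labels and predicted click probabilities
labels = [0, 1, 1, 0, 1]
preds = [0.10, 0.80, 0.65, 0.30, 0.90]
print(roc_auc_score(labels, preds))  # 1.0 here: every positive outranks every negative
```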
- evaluation on dataset when running on GPU

  To do.
Model Description
Performance
Training Performance
| Parameters | Ascend | GPU |
| --- | --- | --- |
| Model Version | DeepFM | To do |
| Resource | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB | To do |
| Uploaded Date | 09/15/2020 (month/day/year) | To do |
| MindSpore Version | 1.0.0 | To do |
| Dataset | [1] | To do |
| Training Parameters | epoch=15, batch_size=1000, lr=1e-5 | To do |
| Optimizer | Adam | To do |
| Loss Function | Sigmoid Cross Entropy With Logits | To do |
| Outputs | Accuracy | To do |
| Loss | 0.45 | To do |
| Speed | 1pc: 8.16 ms/step | To do |
| Total Time | 1pc: 90 mins | To do |
| Parameters (M) | 16.5 | To do |
| Checkpoint for Fine tuning | 190M (.ckpt file) | To do |
| Scripts | deepfm script | To do |
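The loss function listed above, Sigmoid Cross Entropy With Logits, fuses the sigmoid and the cross-entropy into one numerically stable expression. A minimal NumPy sketch of the standard stable formulation, illustrative only and not the repository's implementation:

```python
import numpy as np

def sigmoid_cross_entropy_with_logits(logits, labels):
    """Per-sample loss in the stable form: max(z, 0) - z*y + log(1 + exp(-|z|))."""
    z = np.asarray(logits, dtype=np.float64)
    y = np.asarray(labels, dtype=np.float64)
    return np.maximum(z, 0.0) - z * y + np.log1p(np.exp(-np.abs(z)))

print(sigmoid_cross_entropy_with_logits([2.0, -1.0], [1, 0]))
# [0.12692801 0.31326169]
```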
Inference Performance
| Parameters | Ascend | GPU |
| --- | --- | --- |
| Model Version | DeepFM | To do |
| Resource | Ascend 910 | To do |
| Uploaded Date | 05/27/2020 (month/day/year) | To do |
| MindSpore Version | 0.3.0-alpha | To do |
| Dataset | [1] | To do |
| batch_size | 1000 | To do |
| Outputs | Accuracy | To do |
| Accuracy | 1pc: 80.55% | To do |
| Model for inference | 190M (.ckpt file) | To do |
Description of Random Situation
We set the random seed before training in train.py.
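For fully repeatable runs, every seed source should be pinned. A minimal sketch, assuming the `mindspore.common.set_seed` API that ModelZoo training scripts commonly use:

```python
import random
import numpy as np
from mindspore.common import set_seed

random.seed(1)     # Python's built-in RNG
np.random.seed(1)  # NumPy's RNG (e.g. data shuffling)
set_seed(1)        # MindSpore's global seed (parameter initialization, random ops)
```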
ModelZoo Homepage
Please check the official homepage.