Contents
- DeepFM Description
- Model Architecture
- Dataset
- Environment Requirements
- Quick Start
- Script Description
- Model Description
- Description of Random Situation
- ModelZoo Homepage
DeepFM Description
Learning sophisticated feature interactions behind user behaviors is critical for maximizing CTR in recommender systems. Despite great progress, existing methods tend to have a strong bias towards low- or high-order interactions, or to require expert feature engineering. The DeepFM paper shows that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture.
Paper: Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Model Architecture
DeepFM consists of two components. The FM component is a factorization machine, which learns low-order (pairwise) feature interactions for recommendation. The deep component is a feed-forward neural network, which learns high-order feature interactions. The FM and deep components share the same raw input feature vector, which enables DeepFM to learn low- and high-order feature interactions simultaneously from the raw input features. The final prediction combines both parts: y = sigmoid(y_FM + y_DNN).
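To make the combination concrete, below is a minimal NumPy sketch of that forward pass. It is illustrative only: it uses a dense input vector and feeds a pooled embedding to the deep part, whereas the actual network operates on sparse fields and concatenated per-field embeddings; all names here are hypothetical.

```python
import numpy as np

def deepfm_forward(x, w0, w, V, dnn_layers):
    """Illustrative DeepFM forward pass: y = sigmoid(y_FM + y_DNN).

    x: dense feature vector, shape (n,)
    w0, w: FM bias (scalar) and first-order weights, shape (n,)
    V: FM embedding matrix, shape (n, k), shared by both components
    dnn_layers: list of (W, b) pairs; the last pair maps to a scalar logit
    """
    # FM first-order (linear) term
    linear = w0 + x @ w
    # FM second-order term via the O(nk) identity:
    # sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(sum_i V_if x_i)^2 - sum_i (V_if x_i)^2]
    xv = x @ V                                    # pooled embedding, shape (k,)
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    y_fm = linear + pairwise
    # Deep component: a plain MLP with ReLU hidden layers over the pooled embedding
    h = xv
    for W, b in dnn_layers[:-1]:
        h = np.maximum(0.0, h @ W + b)
    W_out, b_out = dnn_layers[-1]
    y_dnn = h @ W_out + b_out                     # scalar logit
    return 1.0 / (1.0 + np.exp(-(y_fm + y_dnn)))  # predicted CTR in (0, 1)

# Toy usage with random parameters
rng = np.random.default_rng(0)
n, k, hidden = 10, 4, 8
layers = [(rng.normal(size=(k, hidden)), np.zeros(hidden)),
          (rng.normal(size=hidden), 0.0)]
print(deepfm_forward(rng.random(n), 0.0, rng.normal(size=n),
                     rng.normal(size=(n, k)), layers))
```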
Dataset
- [1] A dataset used in Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction[J]. 2017.
Environment Requirements
- Hardware (Ascend/GPU/CPU)
  - Prepare the hardware environment with an Ascend, GPU, or CPU processor.
- Framework
  - MindSpore
- For more information, please check the resources below:
  - MindSpore Tutorials
  - MindSpore Python API
Quick Start
After installing MindSpore via the official website, you can start training and evaluation as follows:
- running on Ascend

  ```shell
  # run training example
  python train.py \
    --dataset_path='dataset/train' \
    --ckpt_path='./checkpoint' \
    --eval_file_name='auc.log' \
    --loss_file_name='loss.log' \
    --device_target='Ascend' \
    --do_eval=True > ms_log/output.log 2>&1 &

  # run distributed training example
  sh scripts/run_distribute_train.sh 8 /dataset_path /rank_table_8p.json

  # run evaluation example
  python eval.py \
    --dataset_path='dataset/test' \
    --checkpoint_path='./checkpoint/deepfm.ckpt' \
    --device_target='Ascend' > ms_log/eval_output.log 2>&1 &
  OR
  sh scripts/run_eval.sh 0 Ascend /dataset_path /checkpoint_path/deepfm.ckpt
  ```
For distributed training, an HCCL configuration file (rank table) in JSON format needs to be created in advance. It can be generated, for example, with the hccl_tools script provided by MindSpore ModelZoo.
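For reference, a rank table for a single server typically looks like the excerpt below (shortened to two devices). This is an illustrative sketch, not a file from this repository; every value is a placeholder that must match your actual server ID, device IPs, and rank IDs.

```json
{
    "version": "1.0",
    "server_count": "1",
    "server_list": [
        {
            "server_id": "10.0.0.1",
            "device": [
                {"device_id": "0", "device_ip": "192.98.92.100", "rank_id": "0"},
                {"device_id": "1", "device_ip": "192.98.93.100", "rank_id": "1"}
            ]
        }
    ],
    "status": "completed"
}
```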
- running on GPU

  For running on GPU, please change `device_target` from `Ascend` to `GPU` in the configuration file src/config.py.

  ```shell
  # run training example
  python train.py \
    --dataset_path='dataset/train' \
    --ckpt_path='./checkpoint' \
    --eval_file_name='auc.log' \
    --loss_file_name='loss.log' \
    --device_target='GPU' \
    --do_eval=True > ms_log/output.log 2>&1 &

  # run distributed training example
  sh scripts/run_distribute_train.sh 8 /dataset_path

  # run evaluation example
  python eval.py \
    --dataset_path='dataset/test' \
    --checkpoint_path='./checkpoint/deepfm.ckpt' \
    --device_target='GPU' > ms_log/eval_output.log 2>&1 &
  OR
  sh scripts/run_eval.sh 0 GPU /dataset_path /checkpoint_path/deepfm.ckpt
  ```
- running on CPU

  ```shell
  # run training example
  python train.py \
    --dataset_path='dataset/train' \
    --ckpt_path='./checkpoint' \
    --eval_file_name='auc.log' \
    --loss_file_name='loss.log' \
    --device_target='CPU' \
    --do_eval=True > ms_log/output.log 2>&1 &

  # run evaluation example
  python eval.py \
    --dataset_path='dataset/test' \
    --checkpoint_path='./checkpoint/deepfm.ckpt' \
    --device_target='CPU' > ms_log/eval_output.log 2>&1 &
  ```
Script Description
Script and Sample Code
```text
.
└─deepfm
  ├─README.md
  ├─mindspore_hub_conf.py            # config for mindspore hub
  ├─scripts
    ├─run_standalone_train.sh        # launch standalone training(1p) in Ascend or GPU
    ├─run_distribute_train.sh        # launch distributed training(8p) in Ascend
    ├─run_distribute_train_gpu.sh    # launch distributed training(8p) in GPU
    └─run_eval.sh                    # launch evaluating in Ascend or GPU
  ├─src
    ├─__init__.py                    # python init file
    ├─config.py                      # parameter configuration
    ├─callback.py                    # define callback function
    ├─deepfm.py                      # deepfm network
    └─dataset.py                     # create dataset for deepfm
  ├─eval.py                          # eval net
  └─train.py                         # train net
```
Script Parameters
Parameters for both training and evaluation can be set in `src/config.py`.
- train parameters

  ```text
  optional arguments:
    -h, --help            show this help message and exit
    --dataset_path DATASET_PATH
                          Dataset path
    --ckpt_path CKPT_PATH
                          Checkpoint path
    --eval_file_name EVAL_FILE_NAME
                          Auc log file path. Default: "./auc.log"
    --loss_file_name LOSS_FILE_NAME
                          Loss log file path. Default: "./loss.log"
    --do_eval DO_EVAL     Do evaluation or not. Default: True
    --device_target DEVICE_TARGET
                          Ascend or GPU. Default: Ascend
  ```
- eval parameters

  ```text
  optional arguments:
    -h, --help            show this help message and exit
    --checkpoint_path CHECKPOINT_PATH
                          Checkpoint file path
    --dataset_path DATASET_PATH
                          Dataset path
    --device_target DEVICE_TARGET
                          Ascend or GPU. Default: Ascend
  ```
Training Process
Training
- running on Ascend

  ```shell
  python train.py \
    --dataset_path='dataset/train' \
    --ckpt_path='./checkpoint' \
    --eval_file_name='auc.log' \
    --loss_file_name='loss.log' \
    --device_target='Ascend' \
    --do_eval=True > ms_log/output.log 2>&1 &
  ```
The python command above will run in the background; you can view the results through the file `ms_log/output.log`. After training, you'll get some checkpoint files under the `./checkpoint` folder by default. Loss values are saved in the loss.log file (a small parsing sketch follows below):

```log
2020-05-27 15:26:29 epoch: 1 step: 41257, loss is 0.498953253030777
2020-05-27 15:32:32 epoch: 2 step: 41257, loss is 0.45545706152915955
...
```
The model checkpoint will be saved in the current directory.
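The loss lines above have a fixed shape, so they are easy to post-process. Below is a small, hypothetical helper (not part of this repository) that pulls (epoch, step, loss) tuples out of loss.log; the regex simply mirrors the sample lines shown above.

```python
import re

# Matches lines like:
# 2020-05-27 15:26:29 epoch: 1 step: 41257, loss is 0.498953253030777
LOSS_LINE = re.compile(r"epoch: (\d+) step: (\d+), loss is ([\d.]+)")

def parse_loss_log(path="loss.log"):
    """Return a list of (epoch, step, loss) tuples parsed from the log file."""
    records = []
    with open(path) as log_file:
        for line in log_file:
            match = LOSS_LINE.search(line)
            if match:
                epoch, step, loss = match.groups()
                records.append((int(epoch), int(step), float(loss)))
    return records
```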
- running on GPU

  To do.
Distributed Training
- running on Ascend

  ```shell
  sh scripts/run_distribute_train.sh 8 /dataset_path /rank_table_8p.json
  ```
The above shell script will run distributed training in the background. You can view the results through the file `log[X]/output.log`. Loss values are saved in the loss.log file.

- running on GPU

  To do.
Evaluation Process
Evaluation
- evaluation on dataset when running on Ascend

  Before running the command below, please check the checkpoint path used for evaluation.

  ```shell
  python eval.py \
    --dataset_path='dataset/test' \
    --checkpoint_path='./checkpoint/deepfm.ckpt' \
    --device_target='Ascend' > ms_log/eval_output.log 2>&1 &
  OR
  sh scripts/run_eval.sh 0 Ascend /dataset_path /checkpoint_path/deepfm.ckpt
  ```
The above python command will run in the background. You can view the results through the file "eval_output.log". The accuracy is saved in the auc.log file:

```log
{'result': {'AUC': 0.8057789065281104, 'eval_time': 35.64779996871948}}
```
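The reported metric is AUC (area under the ROC curve): the probability that a randomly chosen clicked example is ranked above a randomly chosen unclicked one. For reference, the same quantity can be computed with scikit-learn, assuming it is installed (it is not a dependency of this repository):

```python
from sklearn.metrics import roc_auc_score

# Toy example: 0/1 click labels and predicted click probabilities
labels = [0, 1, 1, 0, 1]
preds = [0.10, 0.80, 0.65, 0.30, 0.90]
print(roc_auc_score(labels, preds))  # 1.0 here: every positive outranks every negative
```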
- evaluation on dataset when running on GPU

  To do.
Model Description
Performance
Training Performance
| Parameters | Ascend | GPU |
| --- | --- | --- |
| Model Version | DeepFM | To do |
| Resource | Ascend 910; CPU 2.60 GHz, 192 cores; memory 755 GB | To do |
| Uploaded Date | 09/15/2020 (month/day/year) | To do |
| MindSpore Version | 1.0.0 | To do |
| Dataset | [1] | To do |
| Training Parameters | epoch=15, batch_size=1000, lr=1e-5 | To do |
| Optimizer | Adam | To do |
| Loss Function | Sigmoid Cross Entropy With Logits | To do |
| Outputs | Accuracy | To do |
| Loss | 0.45 | To do |
| Speed | 1pc: 8.16 ms/step | To do |
| Total Time | 1pc: 90 mins | To do |
| Parameters (M) | 16.5 | To do |
| Checkpoint for Fine tuning | 190M (.ckpt file) | To do |
| Scripts | deepfm script | To do |
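The loss function listed above, Sigmoid Cross Entropy With Logits, fuses the sigmoid and the cross-entropy into one numerically stable expression. A minimal NumPy sketch of the standard stable formulation, illustrative only and not the repository's implementation:

```python
import numpy as np

def sigmoid_cross_entropy_with_logits(logits, labels):
    """Per-sample loss in the stable form: max(z, 0) - z*y + log(1 + exp(-|z|))."""
    z = np.asarray(logits, dtype=np.float64)
    y = np.asarray(labels, dtype=np.float64)
    return np.maximum(z, 0.0) - z * y + np.log1p(np.exp(-np.abs(z)))

print(sigmoid_cross_entropy_with_logits([2.0, -1.0], [1, 0]))
# [0.12692801 0.31326169]
```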
Inference Performance
| Parameters | Ascend | GPU |
| --- | --- | --- |
| Model Version | DeepFM | To do |
| Resource | Ascend 910 | To do |
| Uploaded Date | 05/27/2020 (month/day/year) | To do |
| MindSpore Version | 0.3.0-alpha | To do |
| Dataset | [1] | To do |
| batch_size | 1000 | To do |
| Outputs | Accuracy | To do |
| Accuracy | 1pc: 80.55% | To do |
| Model for inference | 190M (.ckpt file) | To do |
Description of Random Situation
We set the random seed before training in train.py.
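For fully repeatable runs, every seed source should be pinned. A minimal sketch, assuming the `mindspore.common.set_seed` API that ModelZoo training scripts commonly use:

```python
import random
import numpy as np
from mindspore.common import set_seed

random.seed(1)     # Python's built-in RNG
np.random.seed(1)  # NumPy's RNG (e.g. data shuffling)
set_seed(1)        # MindSpore's global seed (parameter initialization, random ops)
```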
ModelZoo Homepage
Please check the official homepage.