hong
8c4573a3cb
GradMaker for dygraph ( #19706 )
...
* refactor dygraph,test=develop
* fix failed unittest,test=develop
* polish code,test=develop
* check windows ci error,test=develop
try to fix windows ci error by np.allclose,test=develop
* polish vlog and profiler, test=develop
* try to fix preceding ops order,test=develop
* test transformer in windows ci, test=develop
* use python c-api to speed up tracer.trace,test=develop
* test=develop, fix docker with paddle nccl problem
* test=develop, add ut for debug string and gradient_accumulator
* test=develop, add tests for layer/gradient_accumulator/prepared_op
* test=develop, fix compile error for test_prepared_op
* test=develop, add more ut for dygraph
* test=develop, create API.spec for dygraph api change
* optimize grad maker; test=develop
* optimize grad maker
* test
* grad make optim; test=develop
* fix unittest bugs; test=develop
* add dygraph grad op maker and split_op
* grad op maker refactor; test=develop
* add dygraph grad maker; test=develop
* fix op deformable_conv_v1_op bug; test=develop
* fix deformable_conv prroi pool bugs;
* fix new op grad op maker bug; test=develop
* fix split by ref bug; test=develop
* fix dygraph auto prune bug; test=develop
* fix test_trace bug; test=develop
* fix fused emb seq pool bug; test=develop
* remove useless code in op_desc file; test=develop
* remove useless code, StrVarBaseNode; test=develop
* fix review issues; test=develop
* fix rank_loss grad maker; test=develop
* remove flag in VarBase; test=develop
* fix distributed_notify_op compile bug ; test=develop
* fix reshape op double grad; test=develop
* fix expand as op; test=develop
* add imperative type_defs.h for demo_train; test=develop
* fix inference lib cmake; test=develop
* fix inference lib; test=develop
* fix inference_lib; test=develop
* fix inference cmake; test=develop
* fix inference lib; test=develop
* fix inference lib; test=develop
* remove condition dygraph grad maker, modify local name; test=develop
* fix split grad maker bug; test=develop
* fix pyramid_op bug; test=develop
* change travis time out limit; test=develop
* restore travis; test=develop
* change timeout limit; test=develop
5 years ago
zhouwei25
b741761098
Integration of third_party compilation structure ( #20887 )
5 years ago
wopeizl
3b31b74e20
remove the warning issue test=develop ( #20718 )
5 years ago
zhouwei25
bcd77e147c
CMake generator support has been added to enable multi-version VS support ( #20755 )
5 years ago
wopeizl
9e5948230e
add support to gcc8, add docker env test=develop ( #19807 )
...
* add support to gcc8, add docker env test=develop
5 years ago
WangXi
507afa8a8a
Fix dgc nan by stripping nccl from sparseReduce. ( #20630 )
5 years ago
石晓伟
48b27229a8
fix version.cmake, test=develop ( #20606 )
5 years ago
633WHU
12e4be0382
Dlpack support ( #20039 )
...
* support dlpack to tensor and implement python interface test=develop
* add unittest for _to_dlpack and from_dlpack test=develop
5 years ago
tangwei12
c9139c3db3
trainer from dataset fetch targets ( #19760 )
...
add executor.FetchHandler for train/infer from the dataset
5 years ago
zhaoyuchen2018
e867366805
Add multihead op for ernie opt ( #19933 )
...
* Add multihead op for ernie opt
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine softmax
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine kernel.
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine cuda kernel
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine cuda version
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine cmake
test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
5 years ago
liym27
3aa331d97e
fix conv2d and conv3d: ( #20042 )
...
1.support asymmetric padding;
2.support padding algorithm:"SAME" and "VALID";
3.support channel_last: data_format NHWC and NDHWC;
4.change doc of python API and c++;
test=develop, test=document_preview
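As a rough illustration of items 1 and 2, the sketch below computes per-dimension padding under the conventional "SAME" and "VALID" definitions and the resulting output size. The function names are hypothetical, dilation is assumed to be 1, and the code is not taken from the Paddle source.

```python
import math


def conv_padding(in_size, kernel, stride, algorithm):
    """Return (pad_before, pad_after) for one spatial dimension.

    Hypothetical helper for illustration; assumes dilation == 1.
    """
    if algorithm == "VALID":
        # VALID never pads; the kernel only visits fully covered positions.
        return (0, 0)
    if algorithm == "SAME":
        # SAME pads so that out_size == ceil(in_size / stride).
        out_size = math.ceil(in_size / stride)
        total = max((out_size - 1) * stride + kernel - in_size, 0)
        before = total // 2
        # An odd total makes the padding asymmetric: the extra cell goes after.
        return (before, total - before)
    raise ValueError("unknown padding algorithm: %s" % algorithm)


def conv_out_size(in_size, kernel, stride, pads):
    """Output length of one dimension given explicit (before, after) padding."""
    before, after = pads
    return (in_size + before + after - kernel) // stride + 1
```

With an input of width 6, a kernel of width 3 and stride 2, "SAME" yields padding (0, 1), which is one case where asymmetric padding is needed.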
5 years ago
石晓伟
01b9d07963
update operator compatible info, test=develop ( #19978 )
...
* update operator compatible info, test=develop
* revert cmake/version.cmake, test=develop
* add unit_tests and fix bugs, test=develop
* update ../paddle/fluid/framework/framework.proto, test=develop
* fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop
* update paddle/fluid/framework/version_test.cc, test=develop
* add comments and rename interfaces, test=develop
5 years ago
gongweibao
ae593e57fa
Add dgc source code to bos platform. ( #19892 )
...
* add dgc.tgz to bos
5 years ago
Yiqun Liu
3cd985a669
Add a pass to fuse fc+elementwise_add+layernorm ( #19776 )
...
* Add fc_elementwise_layernorm_fuse pass and unittest.
* Add fused_fc_elementwise_layernorm op and its GPU kernel.
test=develop
* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
* Add the setting of attrs in the definition of binary_op.
test=develop
* Add comment.
* Implement the unittest.
test=develop
* Change the unittest name of layer_norm.
test=develop
5 years ago
chengjuntao
00efd1d8a9
add deformable conv v1 op and cpu version of deformable conv v2 ( #18500 )
...
* add deformable conv v1 op, test=develop
5 years ago
zhouwei25
b5a5d93bbe
fix the dependencies of third party and inference lib ( #19684 )
5 years ago
Huihuang Zheng
12542320c5
Replace TemporaryAllocator by CUDADeviceContextAllocator ( #18989 )
...
TemporaryAllocator is a singleton used to allocate memory for cuDNN. Because it is a singleton, removing it can improve memory performance.
We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, avoiding the singleton.
Also added data_feed_proto to operator to fix CPU compilation in CI.
6 years ago
Yiqun Liu
a65c728e5d
Implement the GPU kernel of fc operator ( #19687 )
...
* Refine the codes related to fc op.
* Add GPU implementation for fc functor.
* Apply fc_fuse_pass in GPU inference.
test=develop
* Change the cmake for fc op.
* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
* Add an attribute to set the activation type in fc_op.
* Enhance the unittest of fc_op.
test=develop
* Remove the declaration of FCOpGrad back to the header file.
test=develop
* Set default value for newly added arguments in test_fc_op.
test=develop
6 years ago
baojun
87f13f7569
upgrade ngraph to support mkldnn v1.0 ( #19689 )
6 years ago
Tao Luo
bcddbc78d4
remove -Wmaybe-uninitialized warning ( #19653 )
...
* remove -Wmaybe-uninitialized warning
test=develop
* remove uninitialized op_handle_ in scale_loss_grad_op_handle.cc
test=develop
6 years ago
Tao Luo
3aaea4c545
fix inference_lib deps error ( #19632 )
...
test=develop
6 years ago
liuwei1031
9c88570881
fix the warning caused by mismatched arguments of flags.cmake ( #19576 )
6 years ago
silingtong123
e79cf3bce7
Enable online compilation of openblas on windows ( #19602 )
...
* test=develop, Support for online compilation of openblas
* test=develop, Modify the prefix of openblas static library
6 years ago
hutuxian
c756b5d231
Paddlebox Framework ( #18982 )
...
* Support looking up embeddings from BoxPS.
* Add a _pull_box_sparse op, for now this op is not exposed to users.
* Add a BoxHelper class, providing 'BeginPass', 'EndPass', 'FeedPass' functions and so on.
* Add 'BoxPSDataset' in python code.
* Add a compile option WITH_BOX_PS and a macro PADDLE_WITH_BOX_PS.
* Add UT.
* For more details, please refer to: https://github.com/PaddlePaddle/Paddle/pull/18982
6 years ago
liuwei1031
d6cb1a4122
add dynamic C runtime support on windows, test=develop ( #19502 )
6 years ago
Yihua Xu
b920395842
Use sparse matrix to implement fused emb_seq_pool operator ( #19064 )
...
* Implement the operator with sparse matrix multiply
* Update the URL of mklml library.
test=develop
* Disable the MKLML implementation on non-Linux platforms.
test=develop
* Ignore the deprecated status for windows
test=develop
6 years ago
Zeng Jinle
5b6673c44d
merge develop to solve conflict, also fix API doc, test=develop ( #18823 )
6 years ago
liuwei1031
50582071dc
fix compilation issue in windows vs2017 ( #19183 )
...
* fix compilation issue in windows vs2017, test=develop
* fix gtest lib not found issue, test=develop
6 years ago
zhouwei25
2f0dc8463a
fix the bug that PYTHON_EXECUTABLE not exists ( #19225 )
...
* test=develop,fix the inference library compilation bug on windows
* test=develop,Fix the inference library compilation bug on windows
* test=develop,fix the bug that PYTHON_EXECUTABLE not exists
6 years ago
zhouwei25
ef46918ad1
Fix the inference library compilation bug on windows ( #19190 )
...
* test=develop,fix the inference library compilation bug on windows
6 years ago
Tao Luo
32a670badc
remove WITH_FAST_MATH option ( #19149 )
...
test=develop
6 years ago
wopeizl
80b7ef6fc8
add tensorrt support for windows ( #19084 )
...
* add tensorrt support for windows
6 years ago
Krzysztof Binias
e1b5833b88
[PROPOSAL] Add support for dynamic code analysis (Sanitizers) ( #18303 )
...
* Add support for dynamic code analysis (Sanitizers)
test=develop
* Move options to one option
test=develop
* Missing check
test=develop
6 years ago
baojun
adcfc53b18
upgrade ngraph version and simplify ngraph engine ( #18853 )
...
* upgrade ngraph to v0.24 test=develop
* simplify io test=develop
6 years ago
Huihuang Zheng
0d3f16f53e
Try to modify external gflags to solve CI compilation ( #18872 )
6 years ago
Tao Luo
8de5aa1bde
remove package.cmake ( #18760 )
...
test=develop
6 years ago
Tao Luo
0ae45f0b53
remove unused cmake file ( #18744 )
...
test=develop
6 years ago
Tao Luo
c457a69db5
remove unused gzstream.cmake ( #18705 )
...
test=develop
6 years ago
Jacek Czaja
0d8e6c9b8b
MKL-DNN upgrade to 0.20 ( #18370 )
...
test=develop
6 years ago
gongweibao
ec1000cca9
Change to use brpc rdma branch instead of personal branch. ( #18683 )
6 years ago
Jiabin Yang
898237c19a
Downgrade gcc to 4.8 ( #18614 )
...
* test=develop, fix docker with paddle nccl problem
* test=develop, downgrade gcc to 4.8 for latest-dev
* test=develop, downgrade gcc to 4.8 for latest-dev
* test=develop, modify cmake to renew all third_party
* test=develop, invoke ci
* test=develop, invoke ci
* test=develop, compile python with wide-unicode
* test=develop, refine env settings
* test=develop, refine env settings
6 years ago
guru4elephant
d714bf037c
remove async executor and add data_feed.proto to the deps of train demo ( #18659 )
...
* remove async executor and add data_feed.proto to the deps of train demo
6 years ago
kh2se2013
9ad57f2dfd
1)change to parallel mode on python coverage run ( #18594 )
...
2)add pip install coverage in Dockerfile.tmp
test=develop
6 years ago
kh2se2013
ac81c81be1
unset CMAKE_BUILD_TYPE when WITH_COVERAGE = ON ( #18541 )
...
install coverage package in develop image
test = develop
6 years ago
石晓伟
1529154821
Support Bitmain Anakin ( #18542 )
...
* update anakin-engine interfaces for content-dnn
test=develop
* support only-gpu mode of Anakin
modify eltwise parse
test=develop
* modification for thread-safe
test=develop
* Integrated template instance
test=develop
* increase template parameters
test=develop
* support MLU predictor
test=develop
* update anakin cmake files
test=develop
* update TargetWrapper::set_device
* update the initialization of anakin subgraph
test=develop
* use the default constructor of base class
test=develop
* load model from buffer with length
test=develop
* modify the access level of class
test=develop
* support anakin for bitmain arch
test=develop
* remove files
* checkout cmakelists
test=develop
6 years ago
石晓伟
047bba855b
Remove the obsolete cmake options ( #18481 )
...
* remove the obsolete cmake options, test=develop
* remove unittests, test=develop
6 years ago
guru4elephant
ef81ff742a
update pslib library path ( #18415 )
...
change url of pslib.tar.gz
6 years ago
kh2se2013
27fb9cad65
add WITH_COVERAGE option, default OFF ( #17872 )
...
* add WITH_COVERAGE option, default OFF
test=develop
* add coverage for python sdk
test=develop
* fix code style
* fix COVERAGE_FILE path
test=develop
* remove coverage package
test=develop
* test = develop, run coverage as module
6 years ago
Tao Luo
3c9755bbb9
remove unused jemalloc option ( #18314 )
...
test=develop
6 years ago
wopeizl
daa32d5383
fix package generation for inference test=develop ( #18220 )
6 years ago
Wojciech Uss
c26130f3a9
reuse C-API INT8 unit test application ( #18077 )
...
* reuse C-API INT8 unit test application
test=develop
* updates after review
test=develop
6 years ago
Michał Gallus
8462e2b805
Disable MKLDNN FC in Resnet50 test ( #18030 )
6 years ago
tensor-tang
5c06bff222
combine noavx and avx package ( #17889 )
...
* support avx and noavx core
* add catch and give some log
test=develop
* fix build
test=develop
* add missing package
test=develop
* fix pybind name
test=develop
* fix import error
test=develop
* combine noavx core
test=develop
* add requirements
test=develop
* fix unknown message
test=develop
* fix api spec
test=develop
* refine and clean
test=develop
* update
* pass dist ut
* follow comments
test=develop
* refine scripts
test=develop
6 years ago
石晓伟
bce259e5bf
Update the Anakin interfaces for content-dnn and MLU ( #17890 )
...
* update anakin-engine interfaces for content-dnn
test=develop
* support only-gpu mode of Anakin
modify eltwise parse
test=develop
* modification for thread-safe
test=develop
* Integrated template instance
test=develop
* increase template parameters
test=develop
* support MLU predictor
test=develop
* update anakin cmake files
test=develop
* update TargetWrapper::set_device
* update the initialization of anakin subgraph
test=develop
* use the default constructor of base class
test=develop
6 years ago
wopeizl
3d0e1204d6
add support for cuda9 on windows test=develop ( #17594 )
...
* add support for cuda9 on windows test=develop
* use different git address for cuda9 compatible on windows
6 years ago
wopeizl
82b834cbdb
use the bj as default address instead of cdn test=develop ( #17795 )
...
cdn.bcebos.com can be randomly unstable for unknown reasons, so restore the default address to bj.bcebos.com.
6 years ago
wopeizl
f893914f1f
fix the dll not found issue on windows ( #17750 )
...
* fix the dll not found issue on windows
6 years ago
baojun
2c58f1a83c
[NGraph] Added lookup table to ngraph engine test=develop ( #17647 )
6 years ago
Bai Yifan
bba57cdd82
Add deformable conv v2 op,test=develop ( #17145 )
...
* unit commits, test=develop
* update API.spec, test=develop
6 years ago
Yiqun Liu
5782dddad0
Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2 ( #17415 )
...
* Optimize the concat and split kernel for special cases that the number of inputs/outputs is 2.
test=develop
* Refine codes.
test=develop
* Correct the condition.
test=develop
* Move the define of tmp_data outside the if statement.
* Print the cudnn minor version.
test=develop
* Fix the case when in_num/o_num is 1 in concat/split op.
test=develop
* Remove const_cast.
test=develop
6 years ago
Michał Gallus
0c39b97b4e
[MKL-DNN] Add Fully Connected Op for inference only( #15226 )
...
* fuse mul and elementwise add to fc
* Reimplement the FC forward operator
* Fix FC MKLDNN integration by transposing weights
* Add FC MKLDNN Pass
test=develop
* FC MKLDNN Pass: change memcpy to std::copy
* Fix MKLDNN FC handling of mismatch input and weights dims
* Lower tolerance for MKL-DNN in resnet50 test
test=develop
* Adjust FC to support MKLDNN Op placement
test=develop
* Adjust Placement Op to set use_mkldnn attribute for graph
test=develop
* MKLDNN FC: fix weights format so that gemm version is called
test=develop
* FC MKLDNN: Remove tolerance decrease from tester_helper
* FC MKL-DNN: Refactor the code, change input reorder to weight reorder
* MKL-DNN FC: Introduce operator caching
test=develop
* FC MKL-DNN: Fix the tensor type in ExpectedKernelType
test=develop
* FC MKL-DNN: fix style changes
test=develop
* FC MKL-DNN: fallback to native on non-supported dim sizes
test=develop
* FC MKLDNN: fix CMake paths
test=develop
* FC MKLDNN: Refine placement pass graph mkldnn attribute
test=develop
* Fix Transpiler error for fuse_conv_eltwise
test=develop
* Fix missing STL includes in files
test=develop
* FC MKL-DNN: Enable new output size computation
Also, refine pass to comply with newest interface.
test=develop
* FC MKL-DNN: enable only when fc_mkldnn_pass is enabled
* FC MKL-DNN: Allow Weights to use oi or io format
* FC MKL-DNN: Adjust UT to work with correct dims
test=develop
* Enable MKL DEBUG for resnet50 analyzer
test=develop
* FC MKL-DNN: Improve Hashing function
test=develop
* FC MKL-DNN: Fix shape for fc weights in transpiler
* FC MKL-DNN: Update input pointer in re-used fc primitive
* Add log for not handling fc fuse for unsupported dims
test=develop
* FC MKL-DNN: Move transpose from pass to Op Kernel
test=develop
* FC MKL-DNN: Disable transpose in unit test
test=develop
* FC MKL-DNN: Remove fc_mkldnn_pass from default list
* Correct Flag for fake data analyzer tests
test=develop
* FC MKL-DNN: Add comment about fc mkldnn pass disablement
test=develop
* FC MKL-DNN: Disable fc in int8 tests
test=develop
6 years ago
mozga-intel
6101fd57ad
update ngraph to v0.19 test=develop ( #17582 )
6 years ago
Tao Luo
3d19f44a89
remove unused SERIAL compiler option ( #17500 )
...
test=develop
6 years ago
wopeizl
ca3ba378c7
fix the random compilation failure on windows test=develop ( #17475 )
...
* fix the random compilation failure on windows
6 years ago
jiaqi
66d51206b1
add save/load model, shrink table, cvm, config file & fix pull dense bug ( #17118 )
...
* add save/load model, shrink table, cvm, config file & fix pull dense bug
test=develop
* fix global shuffle bug, fix pull dense bug, fix release memory bug, fix shrink error
add client flush, add get data size
test=develop
* fix global shuffle bug
test=develop
* fix global shuffle bug
test=develop
* fix code style
test=develop
* fix code style & modify pslib cmake
test=develop
* fix error of _role_maker
test=develop
* fix code style
test=develop
* fix code style
test=develop
* fix code style
test=develop
* fix code style
test=develop
* fix code style
test=develop
* fix windows compile error of fleet
test=develop
* fix global shuffle bug
* add comment
test=develop
* update pslib.cmake
test=develop
* fix fill sparse bug
test=develop
* fix push sparse bug
test=develop
6 years ago
Jiabin Yang
c843e64cf5
Revert "rename the default version from '0.0.0' to 'latest' ( #17304 )" ( #17356 )
...
This reverts commit f456c8beb8.
6 years ago
wopeizl
f456c8beb8
rename the default version from '0.0.0' to 'latest' ( #17304 )
...
* rename the default version from '0.0.0' to 'latest'
6 years ago
Tao Luo
ff1661f12a
remove unused FLAGS_warpctc_dir ( #17162 )
...
* remove unused FLAGS_warpctc_dir
test=develop
* remove FLAGS_warpctc_dir
test=develop
6 years ago
石晓伟
a72dbe9abf
Cherry-pick benchmark related changes from release/1.4 ( #17156 )
...
* cherry-pick commit from 8877054
* cherry-pick commit from 3f0b97d
* cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn
(cherry picked from commit 8643dbc233)
* Cherry-Pick from 16662 : Anakin subgraph cpu support
(cherry picked from commit 7ad182e16c)
* Cherry-pick from 1662, 16797.. : add anakin int8 support
(cherry picked from commit e14ab180fe)
* Cherry-pick from 16813 : change singleton to graph RegistBlock
test=release/1.4
(cherry picked from commit 4b9fa42307)
* Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2
Support ShuffleNet and MobileNet-v2, test=release/1.4
(cherry picked from commit a6fb066f90)
* Cherry-pick : anakin subgraph add opt config layout argument #16846
test=release/1.4
(cherry picked from commit 8121b3eccb)
* 1. add shuffle_channel_detect
(cherry picked from commit 6efdea8997)
* update shuffle_channel op convert, test=release/1.4
(cherry picked from commit e4726a066f)
* Modify symbol export rules
test=develop
6 years ago
baojun-nervana
855bb4d408
update ngraph to v0.18 test=develop
6 years ago
gongweibao
cbdb8a17b1
Polish DGC code ( #16818 )
6 years ago
wopeizl
b6150e1fa7
disable the share lib for protobuf test=develop ( #16778 )
6 years ago
Chen Weihang
0b2aec14b6
Revert "Model data cryption link all lib ( #16555 )"
...
test=develop
This reverts commit c38c7c5619.
6 years ago
Chen Weihang
c38c7c5619
Model data cryption link all lib ( #16555 )
...
* link the libwbaes.so into paddle
* polish detail, test=develop
* try fix mac_pr_ci error, test=develop
* add compile option, test=develop
* fix ci error, test=develop
* ignore failed to find mac lib, test=develop
* change cdn to bj, cdn can't get the latest version
* trigger ci, test=develop
* temporary delete win32 lib linking, test=develop
* change https to http, test=develop
* turn compile option on to off
* turn compile option off to on, test=develop
* try lib compiled by gcc4.8, test=develop
* update lib version, test=develop
* link other lib, test=develop
* add setup config
* delete false, test=develop
* delete no_soname, test=develop
* recover so name set
* fix, test=develop
* adjust make config, test=develop
* remove link to wbaes, test=develop
* remove useless define, test=develop
6 years ago
石晓伟
5dea0bdd1b
Merge pull request #16498 from Shixiaowei02/feature/anakin-engine
...
merge feature/anakin-engine to develop
6 years ago
gongweibao
fea91164b7
Fix windows compilation error! ( #16546 )
...
* fix compiled
test=develop
* follow comments test=develop
6 years ago
Shixiaowei02
bddb2cd315
resolve conflicts with the develop branch test=develop
6 years ago
gongweibao
eb83abeac3
Add DGC(Deep Gradient Compression) interface. ( #15841 )
6 years ago
baojun
b1d2605152
fix compile issue test=develop ( #16447 )
6 years ago
nhzlx
953bdde058
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
...
test=develop
6 years ago
liuwei1031
de3b70a101
fix cdn issue, test=develop ( #16423 )
...
* fix cdn issue, test=develop
* fix cdn issue, test=develop
6 years ago
nhzlx
f3a2e4b3d8
1. Add ANAKIN_ROOT compile option
...
2. refine trt code
test=develop
6 years ago
qingqing01
8ad672a287
Support sync batch norm. ( #16121 )
...
* Support Sync Batch Norm.
* Note, do not enable it in one device.
Usage:
build_strategy = fluid.BuildStrategy()
build_strategy.sync_batch_norm = True
binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
    loss_name=loss_mean.name,
    build_strategy=build_strategy)
6 years ago
Brian Liu
db120b9392
Upgrade MKLDNN to v0.18-rc and fix issue caused by lib/lib64 ( #15861 )
...
* Upgrade MKLDNN to v0.18-rc and fix issue caused by lib/lib64
Upgrade MKLDNN to v0.18-rc
Also fix the issue during upgrade
test=develop
* Rebase MKLDNN to rls-v0.18 branch
Some issues in v0.18-rc that caused the INT8 conv op unit test failure were fixed
in the rls-v0.18 branch
test=develop
* Upgrade MKLDNN from v0.18rc to formal v0.18 tag
test=develop
* Fix the windows compile issue.
test=develop
6 years ago
Tao Luo
344f098a34
Merge pull request #15963 from baojun-nervana/ngraph_v14
...
Fix lib64 issue on centos
6 years ago
Tao Luo
4efdebc6f6
Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt
...
Optimize gelu operation with mkl erf
6 years ago
baojun-nervana
b51e4dc0a4
fix lib64 test=develop
6 years ago
Tao Luo
47d36b2008
Merge pull request #15924 from baojun-nervana/ngraph_v14
...
Update ngraph version to v0.14
6 years ago
dzhwinter
225c11a91f
polish cudnn related code and fix bug. ( #15164 )
...
* staged.
* polish code
* polish code. test=develop
* polish code. test=develop
* api change. test=develop
* fix default value. test=develop
* fix default value. test=develop
6 years ago
Yihua Xu
7396788694
Optimize gelu operation with mkl erf.
...
test=develop
6 years ago
baojun-nervana
2ffacdebc2
Update ngraph version to v0.14 test=develop
6 years ago
liangan1
4acc522087
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
...
test=develop
6 years ago
tensor-tang
ee2321debd
Revert 15770 develop a6910f900 gelu mkl opt ( #15872 )
...
* Revert "Optimize Gelu with MKL Erf function (#15770 )"
This reverts commit 676995c86c.
* test=develop
6 years ago
Yihua Xu
676995c86c
Optimize Gelu with MKL Erf function ( #15770 )
...
* Optimize for gelu operator
* Set up the low accuracy mode of MKL ERF function.
test=develop
* Only enable MKLML ERF when OS is linux
* Use the special mklml version that includes the vmsErf function to verify the gelu mkl kernel.
test=develop
* Add the CUDA macro to avoid NVCC's compile issue.
test=develop
* Add the TODO comments for mklml library modification.
test=develop
* Clean Code
test=develop
* Add the comment of macro for NVCC compiler.
test=develop
6 years ago
JiabinYang
ba38be7242
test=develop, fix protobuf runtime update and keep lib in 3.1.0
6 years ago
Tao Luo
50ffed27f6
Merge pull request #15813 from luotao1/legacy_any
...
remove legacy any.cmake
6 years ago
Tao Luo
60cb0b9781
remove legacy $external_project_dependencies variable
...
test=develop
6 years ago
Tao Luo
c797a1f050
remove legacy any.cmake
6 years ago
Tao Luo
f52d372876
remove legacy EXTERNAL_LIBS variable
...
test=develop
6 years ago
Tao Luo
0d38817cf4
remove legacy EIGEN_USE_THREADS, WITH_ARM_FP16 options
6 years ago
Tao Luo
978599154f
remove legacy WITH_GOLANG, GLIDE_INSTALL options
6 years ago
Tao Luo
f522b4417f
remove legacy WITH_TIMER, WITH_DOC, ON_TRAVIS options
6 years ago
Tao Luo
ff2a8386a0
remove legacy USE_EIGEN_FOR_BLAS option
6 years ago
Tao Luo
688023ede0
remove legacy WITH_RDMA option
6 years ago
Tao Luo
6311ae5df9
remove legacy WITH_DOUBLE option
6 years ago
JiabinYang
48cf979a21
test=develop, install requirements before start for Linux
6 years ago
JiabinYang
fe7ffedc1a
test=develop, update protobuf
6 years ago
dzhwinter
02a585b5c7
add details. test=develop
6 years ago
dzhwinter
04e9776aef
add details. test=develop
6 years ago
wopeizl
3614dadf23
Merge pull request #15631 from wopeizl/windows/fixci
...
fix ci broken randomly and disable some warnings
6 years ago
peizhilin
805d505f14
disable warnings for third parties
...
test=develop
6 years ago
Yan Xu
c356bd01e9
fix invalid paddle_version on tag branch test=develop ( #15551 )
6 years ago
peizhilin
3a4110f960
fix ci broken randomly and disable some warnings
...
test=develop
6 years ago
Krzysztof Binias
b1bdcd4de8
Make separate folders for mkldnn codes
...
test=develop
6 years ago
Tao Luo
c42ef5bf05
remove legacy WITH_DOC option
...
test=develop
6 years ago
chengduo
7166b52a6e
add limit_of_tmp_allocation for CI ( #15513 )
...
test=develop
6 years ago
Tao Luo
df92d05ef3
remove legacy IOS option
...
test=develop
6 years ago
Tao Luo
cf29ea1592
remove legacy ANDROID option
6 years ago
Tao Luo
3ce10dba15
remove legacy USE_NNPACK option
6 years ago
Tao Luo
2d529186f1
remove legacy CMAKE_CROSSCOMPILING option
6 years ago
Tao Luo
9353bc58dd
remove legacy MOBILE_INFERENCE option
6 years ago
Tao Luo
b4ccae75c0
remove legacy target in cmake/util.cmake
6 years ago
Tao Luo
e000d17a0c
remove legacy WITH_SWIG_PY option
6 years ago
Tao Luo
561ae9d507
remove legacy WITH_C_API option
6 years ago
Wu Yi
7e651a38dd
fix mac cmake version 3.13 build ( #15386 )
...
* fix mac cmake version 3.13 test=develop
* fix again test=develop
6 years ago
Yiqun Liu
568cc2ffa8
Optimize while_op for test ( #14764 )
...
* Simplify the compare op for CPU.
* Use asynchronous tensor copy in reshape_op's kernel.
* Optimize while_op for test, avoiding creating variables every time.
test=develop
* Enable the cache of kernel type and kernel function.
test=develop
* Enable profiling with gperftools.
* Remove flags for testing, and fix the linking error.
test=develop
* Delete the codes of ChooseKernel.
test=develop
* Fix bug when preparing ExecutorPrepareContext for while_op.
* Fix missing depending on grpc libraries.
* Remove the redundant print.
test=develop
* Follow comments.
* Remove the codes related to prepare the ExecutorPrepareContext for while_op.
test=develop
6 years ago
peizhilin
439691f5bd
adjust the shlwapi on windows
...
test=develop
6 years ago
peizhilin
92da467c99
Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue
6 years ago
Sang Ik Lee
9181dea9f3
Set correct TBB library name in debug build and remove warning related to rpath dependency from symlink.
...
test=develop
6 years ago
baojun-nervana
bb9f7a14a0
Fix cmake warning test=develop
6 years ago
Tao Luo
f23a257e90
use the new MKLDNN repo url
...
test=develop
6 years ago
chengduo
55a0672378
fix compute_75 of cuda_cmake ( #15209 )
...
test=develop
6 years ago
Jiabin Yang
7b8b42689a
Merge pull request #15190 from luotao1/mklml_update
...
update mklml version
6 years ago
xuezhong
c0bc818688
Merge pull request #15188 from velconia/add_pyramid_dnn_support
...
Add no lock optimization pass
6 years ago
Tao Luo
49c31e5da4
disable mkl for mac
...
test=develop
6 years ago
chengduo
b1ea335f60
add sm_75 support ( #15198 )
...
test=develop
6 years ago
minqiyang
68a07328fa
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_pyramid_dnn_support
...
test=develop
6 years ago
Tao Luo
ee59e60f77
update mklml version
...
test=develop
6 years ago
minqiyang
4bfa110fd8
Add no lock optimize pass
...
test=develop
6 years ago
Qiyang Min
1df2399e00
Merge pull request #15180 from velconia/add_pyramid_dnn_support
...
Add JeMalloc
6 years ago
Yan Chunwei
875a07c32d
refactor inference analysis api ( #14634 )
6 years ago
minqiyang
583f7ce173
Add dynamic jemalloc modules
...
test=develop
6 years ago
baojun-nervana
f0cde74564
Update ngraph with elt-wise relu test=develop
6 years ago
peizhilin
25523bb8e6
test=develop
6 years ago
peizhilin
9ae50dd07d
fix gpu build issue on windows test=develop
6 years ago
Jiabin Yang
adc96e06d9
Merge pull request #15107 from luotao1/mkl_version_update
...
update mkl version, and add mkl-mac version
6 years ago
Tao Luo
d319ffcd27
update mkl version, and add mkl-mac version
...
test=develop
6 years ago
qingqing01
6f0a1d7b47
Inception fusion operator. ( #14968 )
...
* Inception fusion operator.
* Support horizontal layer fusion in conv_fusion_op.
* Search conv algo strategy for variable-length input.
Search N times and cache the searched algos. For any other input, choose the algo of the cached input whose area is closest to that input.
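The caching strategy described above can be sketched roughly as follows. This is an illustrative model only: the class name, the `search_fn` callback and the use of height times width as the "area" are assumptions, not the operator's actual code.

```python
class ConvAlgoCache:
    """Hypothetical cache illustrating the strategy; not Paddle's implementation."""

    def __init__(self, search_fn, search_times):
        self._search_fn = search_fn        # expensive search: (h, w) -> algo
        self._search_times = search_times  # N: maximum number of real searches
        self._cache = {}                   # (h, w) -> algo found for that shape

    def get(self, height, width):
        key = (height, width)
        if key in self._cache:
            return self._cache[key]
        if len(self._cache) < self._search_times:
            # Still within the search budget: search and remember the result.
            algo = self._search_fn(height, width)
            self._cache[key] = algo
            return algo
        # Budget exhausted: reuse the algo of the cached shape whose area
        # is closest to the area of this input.
        area = height * width
        nearest = min(self._cache, key=lambda k: abs(k[0] * k[1] - area))
        return self._cache[nearest]
```

After N distinct shapes have been searched, later shapes never trigger a new search, which trades some algorithm quality for a bounded search cost on variable-length inputs.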
6 years ago
wopeizl
7ab501264d
Merge pull request #15069 from wopeizl/windows/dsosupport
...
add cuda dso support for windows
6 years ago
baojun-nervana
555fbc10d8
upgrade ngraph to v0.10.1 test=develop
6 years ago
Yu Yang
efa630eadb
Refine Dockerfile ( #14908 )
...
* Refine Dockerfile
* Add tasks, cmake gen
* Fix code error
* Disable compile after paddle_build.sh
* Refine
* Skip on PY35 CI
* Change env
* Refine paddle_build.sh
* Expose gen_fluid_lib
* Refine mkldnn.cmake
* Refine mkldnn.cmake
* Refine mkldnnlib
* Skip unstable tests
6 years ago
peizhilin
01c00b07dd
fix test issues on windows
...
test=develop
6 years ago
peizhilin
1e7f83e60a
add cuda dso support for windows
...
test=develop
6 years ago
gongweibao
00dadb0720
fix apple cudnn compilation error test=develop ( #15003 )
6 years ago
peizhilin
f31d65454c
use the default cdn address for mklml package on windows
...
test=develop
6 years ago
peizhilin
b6d7f0e5ec
use the CDN as the source location
...
test=develop
6 years ago
peizhilin
1cc9d59838
disable xbyak on windows
...
test=develop
6 years ago
peizhilin
40a94a138f
remove irrelevant fix for mkl
...
test=develop
6 years ago
peizhilin
07c7eaabb4
Merge remote-tracking branch 'upstream/develop' into windows/mkl
...
test=develop
6 years ago
peizhilin
19ebd8b4cf
add ctc support for windows
6 years ago
peizhilin
17fb3253c3
keep the mkl win's version inconsistent with Linux's
...
test=develop
6 years ago
peizhilin
fa135bbf52
Fix the mkl build script on windows
...
test=develop
6 years ago
colourful-tree
44ad2f4479
Merge pull request #14873 from colourful-tree/develop
...
add pslib(pserver) to paddle, an industrial scale high performance parameter server library
6 years ago
Yu Yang
2803cf5776
Merge pull request #14868 from reyoung/feature/refine_w2v
...
Feature/refine w2v
6 years ago
Zhaolong Xing
3e32a46490
Merge pull request #14916 from NHZlX/copy_trt_lib_to_inference_lib
...
copy trt header and lib to fluid_inference_install_dir/third_party/install/tensorrt
6 years ago
peizhilin
b601f2de8d
include the mkl fix only
...
test=develop
6 years ago
guru4elephant
a79a3ea2f0
Merge branch 'develop' into develop
6 years ago
peizhilin
5a6d7fe2ff
add mkl,ctc support for windows
6 years ago
wopeizl
0f085f0a5a
Merge pull request #14892 from wopeizl/windows/port3
...
fix script issue
6 years ago
nhzlx
4e3e68dfae
copy trt lib to inference lib test=develop
6 years ago
Yu Yang
4de1a8bd9d
Remove unused cmake log
...
test=develop
6 years ago
Yu Yang
740e1626ce
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/refine_w2v
...
test=develop
6 years ago
gongweibao
0b1c7d838c
Add brpc serialization support. ( #11430 )
6 years ago
peizhilin
23dec78772
fix script issue
...
test=develop
6 years ago
heqiaozhi
f81957a753
refine cmake for pslib & pre_define
6 years ago
Yu Yang
15550a2753
Polish code
6 years ago
heqiaozhi
2912d5311b
fix code style bug & change pslib.cmake & change Cmakelist adapt pslib
6 years ago
heqiaozhi
c4cb414291
refine pslib.cmake url to public
6 years ago
Yu Yang
8175983ef9
Merge pull request #14814 from reyoung/feature/gprof
...
Add gperftools supports for PE
6 years ago
Yu Yang
7604b1ad51
Fix Eigen macro when using GPU
...
The macro should be defined by compiler rather than by source.
test=develop
6 years ago
Yu Yang
f0c0bf328d
Add gperftools supports for PE
6 years ago
Xin Pan
41c28d54c6
allow customize kernel selection
...
test=develop
6 years ago
heqiaozhi
419506f510
refine for compile pslib.so
6 years ago
heqiaozhi
a77fa67bbd
async_thread_trainer & libmct & pslib.cmake
6 years ago
Xin Pan
7e0801d4ed
Merge pull request #14441 from baojun-nervana/intel/ngraph_op
...
Implementing ngraph engine
6 years ago
heqiaozhi
4798a8c7b8
pslib_brpc
6 years ago
heqiaozhi
038346c0c2
libmct
6 years ago
heqiaozhi
3c239cd640
pslib
6 years ago
Sang Ik Lee
24e70920db
Refactor some build settings.
...
test=develop
6 years ago
Sang Ik Lee
d6125a5eec
Include ngraph in inference demo build.
...
test=develop
6 years ago
Qiao Longfei
bcad29c680
gzstream depend on the zlib in thirdparty
...
test=develop
6 years ago
Qiao Longfei
35b79ab865
Merge pull request #13983 from jacquesqiao/add-ctr-reader
...
Add ctr reader
6 years ago
Qiao Longfei
1edd435da6
fix ci problem test=develop
6 years ago
Tao Luo
1538059ba3
Merge pull request #14595 from luotao1/clean_infer_library
...
clean inference include files
6 years ago
Qiao Longfei
668ae9083e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-ctr-reader
6 years ago
wopeizl
05b7ee7eeb
Merge pull request #14545 from wopeizl/windows/online
...
Windows/online
6 years ago
Tao Luo
c0b3f93bff
clean inference include files
...
test=develop
6 years ago
minqiyang
e43f5bc77c
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_dist_resnet_ut_in_py36
...
test=develop
6 years ago
peizhilin
6250be4b5c
Merge branch 'windows/build' into windows/online
...
test=develop
6 years ago
peizhilin
30849d1f20
Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago