Commit Graph

217 Commits (7f794ea563ceaee434cb15f168414e48e06a3ca0)

Author SHA1 Message Date
Darcy 8090eb6272 added proto_desc to device_tracer's dep list (#9342)
8 years ago
Yu Yang 1d8fe2a220 Enhance device context pool (#9293)
8 years ago
Yu Yang 5c333e4143 Add dctor for dev_ctx
8 years ago
Yu Yang fe7ed285d1 Extract NCCLCtxMap
8 years ago
Kexin Zhao ed2bc194c5 Merge pull request #9176 from kexinzhao/batch_norm_fp16
8 years ago
Yu Yang 6ebc6bf533 ReorganizeCode
8 years ago
Yu Yang 41ad632341 Add NCCL Group Guard
8 years ago
Yu Yang 99fe83a020 Move nccl helper
8 years ago
Yu Yang a0494f8e55 Mutex lock wait
8 years ago
Kexin Zhao d307b5e4a6 Merge remote-tracking branch 'upstream/develop' into elementwise_add_fp16
8 years ago
Kexin Zhao 182da95317 small fix
8 years ago
Kexin Zhao f2bbbb2b66 fix arithmetic operator
8 years ago
Kexin Zhao 18d616ed70 add float16 arithmetic operators on new GPU
8 years ago
Yu Yang 3aa7051b98 Remove DevCtx lock
8 years ago
Yu Yang d3e55fde03 Guard devctx
8 years ago
Yu Yang 0023c3bcf5 Use atomic bool
8 years ago
Kexin Zhao 446d54f5c3 update
8 years ago
Kexin Zhao ffa22a5f90 fix scaling param type
8 years ago
Kexin Zhao e870947cfd fix batch norm fp16 param type
8 years ago
Yu Yang 5e87cd7574 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
8 years ago
qiaolongfei a39c861530 rm unused private field in profiler
8 years ago
Kexin Zhao a13ec3432a fix test error
8 years ago
Kexin Zhao e4de5dc347 add conv2d fp16 support
8 years ago
Xin Pan d284cf88e5 Merge pull request #9037 from panyx0718/develop
8 years ago
dzhwinter 128adf53cb [Speed]implement cudnn sequence softmax cudnn (#8978)
8 years ago
Yu Yang baef1124fb ParallelExecutor And dependency engine
8 years ago
Xin Pan 4840c49b27 Better timeline
8 years ago
QI JUN 7287630e83 Repair nccl op test (#8575)
8 years ago
Kexin Zhao c88f58dbd8 add comment
8 years ago
Kexin Zhao 3b44b849d3 address comments
8 years ago
Kexin Zhao 1998d5afa2 add gpu info func to get compute cap
8 years ago
kexinzhao 90215b7844 Add float16 GEMM math function on GPU (#8695)
8 years ago
Yiqun Liu fecc9a38c6 Add test for nested RecordEvent. (#8773)
8 years ago
Xin Pan a9b9ec45ab Merge pull request #8775 from panyx0718/test2
8 years ago
Xin Pan 30e556d675 Use vlog instead.
8 years ago
Xin Pan eb46845313 Add warning
8 years ago
Yiqun Liu a032f56f7c Add profiling information for inference example (#8748)
8 years ago
chengduo 84aea8a8a1 Merge pull request #8669 from chengduoZH/feature/concat_op
8 years ago
pzelazko-intel 8c71adaa8c MKLDNN conv2d kernel added (#8451)
8 years ago
kexinzhao 266ccaa843 Integrate float16 into data_type_transform (#8619)
8 years ago
Xin Pan f10152df78 Fix nullptr when doing nested profileing
8 years ago
Xin Pan cf6244c1b8 Improve profiler
8 years ago
chengduoZH 131ec276ed fix bug for big number; float->double and code refine
8 years ago
chengduoZH 00e596edbe get max threads of GPU
8 years ago
Xin Pan f3cbfc021c Add MEMCPY information
8 years ago
Xin Pan 55b2d3d032 Add CPU time to the timeline.
8 years ago
Xin Pan 6720198731 Merge pull request #8663 from panyx0718/test2
8 years ago
Xin Pan 12843a3a53 Firt timeline version
8 years ago
Yu Yang db77006923 Merge pull request #8657 from reyoung/feature/fix_compile
8 years ago
chengduo e9f2033175 Merge pull request #8539 from chengduoZH/feature/refine_elementwise_op_function.h
8 years ago
Yu Yang 22b5c07a7d Fix the compilation on CUDA 9.1/GCC 5.3
8 years ago
Yibing Liu c0876cf686 update due to upstream's change
8 years ago
chengduoZH 90dc33b5ff Add todo for reduceSum
8 years ago
Xin Pan 9bbce49353 Fix version date.
8 years ago
Xin Pan b9ec24c6e9 Extend current profiler for timeline and more features.
8 years ago
chengduoZH b8938b448c refine Sum
8 years ago
chengduoZH a82883922e follow comments
8 years ago
QI JUN 44e3015412 fix nccl version (#8540)
8 years ago
Yu Yang d50016b2a7 Remove build warnings in float16.h (#8481)
8 years ago
dzhwinter 46e4f6ffab small fix
8 years ago
kexinzhao 74e0eb7267 make float16 a pod type (#8456)
8 years ago
Yang Yang(Tony) 87f4311a88 compile with nccl2 (#8411)
8 years ago
kexinzhao f82fa64a06 Move float16 into fluid folder (#8394)
8 years ago
qingqing01 24509f4af9 Fix the grammar in copyright. (#8403)
8 years ago
Tao Luo b56f4a4ee2 move code from /paddle/string to /paddle/fluid/string (#8363)
8 years ago
Yi Wang fc374821dd Correct #include path
8 years ago
Yi Wang 90648f336d Move file to fluid/; Edit CMakeLists.txt
8 years ago