* elementwise_add with bcast: Brian's implementation added, with default bcasts
* elementwise_add with bcast: GetExpectedKernelType added to elementwise_op
* elementwise_add with bcast: use_mkldnn attribute added
* elementwise_add with bcast: changes after review and some formatting
* elementwise_add with bcast: changes after style check
* elementwise_add with bcast: changes after style check cont.
* elementwise_add with bcast: MKLDNN unittests added
* elementwise_add with bcast: original unittests with use_mkldnn flag
* elementwise_add with bcast: handling of MKLDNN format corrected
* elementwise_add with bcast: setting MKLDNN format turned into lambda
* elementwise_add with bcast: MKLDNN format setting turned into separate function
* elementwise_add with bcast: condition for choosing MKLDNN simplified
* elementwise_add with bcast: fix for MKLDNN format set incorrectly in bcasts
* elementwise_add with bcast: changes in unittests for broadcasts
* elementwise_add with bcast: fixes in unittests regarding dimensions
* elementwise_add with bcast: bring back correct format setting in mklml grad path
* elementwise_add with bcast: fixed compilation error
* Add MKLDNN layout support in Paddle
Add an MKLDNN layout to Paddle so that MKLDNN-friendly memory layouts
can be used in MKLDNN-enabled OP kernels. Before this commit, NCHW
was hardcoded in all MKLDNN OP kernels. As a result, a
non-optimized execution path was selected in the MKLDNN primitives,
which hurt performance.
Besides the framework change, three MKLDNN OP kernels were updated
to use the new MKLDNN layout: conv, pool2d and batch_norm.
Other MKLDNN OP kernels need to be updated in a similar way to
achieve the best performance.
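
For illustration only, here is a minimal sketch of what an "MKLDNN-friendly" layout can look like compared with plain NCHW. It assumes a channel-blocked format in the style of nChw8c (channels grouped in blocks of 8); the commit does not say which formats the kernels actually pick, and the function names below are hypothetical, not Paddle or MKLDNN APIs.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Linear offset of element (n, c, h, w) in a plain NCHW buffer."""
    return ((n * C + c) * H + h) * W + w


def blocked_offset(n, c, h, w, C, H, W, block=8):
    """Linear offset in a channel-blocked layout (nChw8c style for block=8).

    Assumption: C is a multiple of `block`. Real libraries pad the
    channel dimension when it is not; this sketch does not.
    """
    assert C % block == 0, "sketch only handles C divisible by block"
    outer, inner = divmod(c, block)
    return (((n * (C // block) + outer) * H + h) * W + w) * block + inner


def reorder_nchw_to_blocked(src, N, C, H, W, block=8):
    """Copy a flat NCHW buffer into a flat channel-blocked buffer."""
    dst = [None] * len(src)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    dst[blocked_offset(n, c, h, w, C, H, W, block)] = \
                        src[nchw_offset(n, c, h, w, C, H, W)]
    return dst


# Tiny example: 1 image, 16 channels, 2x2 spatial, values equal to NCHW offsets.
N, C, H, W = 1, 16, 2, 2
src = list(range(N * C * H * W))
dst = reorder_nchw_to_blocked(src, N, C, H, W)
```

In the blocked buffer the 8 channel values of one pixel sit next to each other, so `dst[:8]` is `[0, 4, 8, 12, 16, 20, 24, 28]` (channels 0 to 7 at position h=0, w=0). That contiguity is what lets a vectorized kernel load a whole channel block at once, whereas with hardcoded NCHW the same values are `H * W` elements apart.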
* Add MKLDNN layout support in activation OP
* Don't populate layout from input to output when kMKLDNN in
* Refine pool mkldnn op kernel
* MKLDNN layout
* Remove the inheritance from tensor file
* MKLDNN layout: refactoring
* Remove additional #define to register new operator
* Prepare mkldnn tests to work with layout