pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2026-01-15 12:15:51 +00:00

Files

Yuchen Hao 4a751dfc20 optimize MulGradient for common shapes (#19705 )

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19705

Optimizes the case where a run of consecutive dims that are not broadcasted is followed by a run of consecutive dims that are broadcasted.
For example, MulGradient(["dC", "A", "B"], ["dA", "dB"], broadcast=True, axis=0) where A.shape == dC.shape == [9508, 80] and B.shape == [80].
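
As a rough illustration of the gradient being computed (a NumPy sketch, not the Caffe2 C++ kernel from this commit): for C = A * B with B broadcast across the rows of A, dA = dC * B and dB is dC * A summed over the broadcast dimension. The function names below are hypothetical, and the sketch assumes B lines up with the trailing dimension of A, as the shapes in the example suggest. The small shapes are only for a quick check; the commit's benchmark shape is [9508, 80].

```python
import numpy as np


def mul_gradient_naive(dC, A, B):
    """Element-by-element reference: B[j] is shared by every row i."""
    rows, cols = A.shape
    dA = np.empty_like(A)
    dB = np.zeros_like(B)
    for i in range(rows):
        for j in range(cols):
            dA[i, j] = dC[i, j] * B[j]      # d(A*B)/dA = B
            dB[j] += dC[i, j] * A[i, j]     # accumulate over the broadcast dim
    return dA, dB


def mul_gradient_2d(dC, A, B):
    """Same math expressed as one broadcast multiply plus one column reduction."""
    dA = dC * B                   # B of shape [cols] broadcasts over rows
    dB = (dC * A).sum(axis=0)     # reduce over the rows that B was broadcast across
    return dA, dB


rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
B = rng.standard_normal(4)
dC = rng.standard_normal((6, 4))

dA_ref, dB_ref = mul_gradient_naive(dC, A, B)
dA_fast, dB_fast = mul_gradient_2d(dC, A, B)
```

Treating the tensor as a plain rows-by-columns matrix is what makes the common shape cheap: the inner loop runs over contiguous memory and the reduction for dB is a single pass over columns.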

Test Plan:
In SKL T6,

Running mul_gradient_benchmark without this optimization
Operator #0 (dA, MulGradient) 11.9119 ms/iter

After this optimization (roughly 17.7x faster),
Operator #0 (dA, MulGradient) 0.672759 ms/iter

D15291800 needs to land first to fix the unit test error.

Reviewed By: dmudiger

Differential Revision: D15075415

fbshipit-source-id: 0f97be17cf8f1dacbafa34cd637fb8bc1c5e5387

2019-12-11 11:39:52 -08:00

docs

Fix several DeprecationWarning: invalid escape sequence (#15733 )

2019-01-05 08:53:35 -08:00

examples

Hipify contrib/nccl (#29385 )

2019-11-08 10:39:17 -08:00

helpers

Add elementwise_affine for LayerNormGradientOp (#19982 )

2019-05-03 15:33:46 -07:00

ideep

caffe2 python ideep conv_op test_int8_convolution skip for python 3

2019-10-08 21:31:11 -07:00

layers

FCTransposed to FbFCPacked (#29766 )

2019-12-10 10:18:21 -08:00

mint

…

mkl

implement operators for DNNLOWP (#18656 )

2019-04-10 12:04:39 -07:00

modeling

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

models

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

onnx

Automatic update of fbcode/onnx to c08a7b76cf7c1555ae37186f12be4d62b2c39b3b (#30619 )

2019-12-10 10:15:08 -08:00

operator_test

optimize MulGradient for common shapes (#19705 )

2019-12-11 11:39:52 -08:00

predictor

fix fc fp16 quantization (#29469 )

2019-11-18 11:26:49 -08:00

rnn

…

serialized_test

Output sequence probability with CTC beam search, optional multiple output sequences (#21927 )

2019-07-02 17:29:13 -07:00

test

Enforce import order to make protobuf cpp implementation in python work (#18560 )

2019-04-03 13:17:08 -07:00

trt

#26426 fixed (#28715 )

2019-11-01 12:53:01 -07:00

__init__.py

Revert #17191 and #17215 that no longer apply on Windows (#17567 )

2019-03-01 10:37:27 -08:00

_import_c_extension.py

Enforce import order to make protobuf cpp implementation in python work (#18560 )

2019-04-03 13:17:08 -07:00

allcompare_test.py

…

attention.py

…

benchmark_generator.py

…

binarysize.py

…

brew_test.py

…

brew.py

Testing for folded conv_bn_relu (#19298 )

2019-04-16 19:04:06 -07:00

build.py

…

cached_reader.py

Pass loop_over optional parameter for cached reader properly. (#21929 )

2019-06-19 18:15:32 -07:00

caffe_translator_test.py

Fix several ResourceWarning: unclosed file (#15746 )

2019-01-09 15:36:53 -08:00

caffe_translator.py

Fix several ResourceWarning: unclosed file (#15746 )

2019-01-09 15:36:53 -08:00

checkpoint_test.py

…

checkpoint.py

Remove setting logger level in caffe2.python.checkpoint (#19803 )

2019-05-10 07:00:58 -07:00

CMakeLists.txt

…

cnn.py

…

compatibility.py

…

context_test.py

…

context.py

…

control_ops_grad_test.py

Fix the weird bug in control_flow_op_test.py (#26931 )

2019-09-26 20:44:03 -07:00

control_ops_grad.py

DeviceScope support for CUDA and testing (#15357 )

2019-01-30 18:42:12 -08:00

control_ops_util.py

…

control_test.py

…

control.py

…

convert_test.py

…

convert.py

…

convnet_benchmarks_test.py

Skip convnets benchmark in rocm CI (#17331 )

2019-02-20 21:12:24 -08:00

convnet_benchmarks.py

…

core_gradients_test.py

Back out "Back out "[Caffe2] Fix device_option propagation"" (#25908 )

2019-09-17 04:01:36 -07:00

core_test.py

Extend Net.RunAllOnGPU() to support RecurrentNetwork op (#15713 )

2019-02-08 15:48:42 -08:00

core.py

BlobReference __getattr__ can only throw AttributeError (#26654 )

2019-09-23 13:01:00 -07:00

crf_predict.py

…

crf_viterbi_test.py

…

crf.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

data_parallel_model_test.py

Skips test_equiv_recurrent (#29255 )

2019-11-06 13:29:23 -08:00

data_parallel_model.py

skip import nccl and gloo_gpu in cpu machine (#22522 )

2019-07-10 11:56:56 -07:00

data_workers_test.py

Disables test_atomic_ops and testInputOrder (#29145 )

2019-11-05 16:53:53 -08:00

data_workers.py

…

dataio_test.py

Fix for flaky caffe2 dataio test (test_time_limit_reader_with_short_limit) (#27592 )

2019-10-10 13:53:58 -07:00

dataio.py

Rearrange stopping condition in CompositeReader (#20062 )

2019-05-06 15:06:32 -07:00

dataset.py

…

db_file_reader.py

…

db_test.py

…

device_checker.py

…

dlpack.h

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

dyndep.py

guard dyndep with a lock (#26153 )

2019-09-13 11:38:14 -07:00

embedding_generation_benchmark.py

…

experiment_util.py

…

extension_loader.py

always restore dlopen flag in dyndep (#22958 )

2019-07-17 10:26:25 -07:00

filler_test.py

caffe2 - Expose tensor filler util to Python (#18886 )

2019-04-08 11:54:10 -07:00

functional_test.py

…

functional.py

…

fused_8bit_rowwise_conversion_ops_test.py

…

gradient_check_test.py

Unify gpu_support variable in python tests (#16748 )

2019-02-07 00:29:51 -08:00

gradient_checker.py

Adding gradient to Boolean Mask operator (#21423 )

2019-06-06 20:48:47 -07:00

gru_cell.py

…

hip_test_util.py

…

hsm_util.py

…

hypothesis_test_util.py

Hypothesis tests: add ability to enforce shape inference (#23935 )

2019-08-13 05:32:41 -07:00

hypothesis_test.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

ideep_test_util.py

…

layer_model_helper.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

layer_model_instantiator.py

…

layer_parameter_sharing_test.py

Add validator for optimizers when parameters are shared

2019-04-17 21:10:38 -07:00

layer_test_util.py

…

layers_test.py

FCTransposed to FbFCPacked (#29766 )

2019-12-10 10:18:21 -08:00

lengths_reducer_fused_8bit_rowwise_ops_test.py

make the threshold for accuracy more precise (#17194 )

2019-02-20 13:14:11 -08:00

lengths_reducer_rowwise_8bit_ops_test.py

…

lstm_benchmark.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

memonger_test.py

Unify gpu_support variable in python tests (#16748 )

2019-02-07 00:29:51 -08:00

memonger.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

mkl_test_util.py

…

model_device_test.py

Unify gpu_support variable in python tests (#16748 )

2019-02-07 00:29:51 -08:00

model_helper_test.py

…

model_helper.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

modifier_context.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

mpi_python.cc

…

muji_test.py

…

muji.py

…

net_builder_test.py

…

net_builder.py

…

net_drawer.py

Allow customization of blob node in net_drawer (#16915 )

2019-02-12 15:02:50 -08:00

net_printer_test.py

…

net_printer.py

Fix spelling errors (#21665 )

2019-06-13 15:21:55 -07:00

nomnigraph_test.py

…

nomnigraph_transformations_test.py

…

nomnigraph_transformations.py

…

nomnigraph.py

…

normalizer_context.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

normalizer_test.py

…

normalizer.py

…

numa_benchmark.py

…

numa_test.py

…

observer_test.py

…

operator_fp_exceptions_test.py

Caffe2 - Add flag to fail if floating point exceptions are detected in operator runs (#18040 )

2019-03-16 12:28:05 -07:00

optimizer_context.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

optimizer_test_util.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

optimizer_test.py

Build Unit Test of SparseRAdam

2019-11-18 15:22:37 -08:00

optimizer.py

Build Unit Test of SparseRAdam

2019-11-18 15:22:37 -08:00

parallel_workers_test.py

ParallelWorkersTest.testParallelWorkersInitFun is flaky (#29045 )

2019-11-01 13:59:02 -07:00

parallel_workers.py

get rid of deprecated thread.isAlive() to use py2.6 modern form is_alive()

2019-10-22 15:37:31 -07:00

parallelize_bmuf_distributed_test.py

Unify gpu_support variable in python tests (#16748 )

2019-02-07 00:29:51 -08:00

pipeline_test.py

…

pipeline.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

predictor_constants.py

…

pybind_state_dlpack.cc

…

pybind_state_dlpack.h

Remove PythonOp non-CPU path and PytorchOp (#15417 )

2019-01-02 16:36:37 -08:00

pybind_state_gpu.cc

add simple memory analyzer and log warning if GPU underutilized (#21024 )

2019-05-28 19:58:54 -07:00

pybind_state_hip.cc

Make caffe2/fb folder compatible with AMD (#29131 )

2019-11-04 16:40:29 -08:00

pybind_state_ideep.cc

Upgrade mkldnn-bridge for dnnlowp support (#16308 )

2019-04-03 12:47:17 -07:00

pybind_state_int8.cc

…

pybind_state_nomni.cc

…

pybind_state_registry.cc

…

pybind_state_registry.h

…

pybind_state.cc

Change interface from map of TensorShape to shapeInfoMap (#30802 )

2019-12-10 00:35:11 -08:00

pybind_state.h

Support unpickle py2 NetDef object in py3 (#26147 )

2019-09-18 02:02:34 -07:00

python_op_test.py

…

queue_util.py

…

record_queue.py

…

recurrent.py

…

regularizer_context.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

regularizer_test.py

Implement "trimmed lasso" regularization and support all available regularization in a single interface (#22966 )

2019-07-17 16:12:31 -07:00

regularizer.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

rnn_cell.py

…

schema_test.py

Pass LRU hash output evicted_values to SparseLookup (#21389 )

2019-07-02 11:27:37 -07:00

schema.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

scope_test.py

…

scope.py

…

session_test.py

…

session.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

sparse_to_dense_mask_test.py

Increase static tolerance for negative feature ids

2019-05-20 19:09:22 -07:00

sparse_to_dense_test.py

…

task_test.py

…

task.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

test_util.py

caffe2 - support flaky operator tests for caffe2 build (#18155 )

2019-03-25 16:58:34 -07:00

text_file_reader.py

Create Node2Vec ModuleKeeper

2019-04-01 10:36:23 -07:00

timeout_guard.py

…

toy_regression_test.py

…

transformations_test.py

Remove sinkMaxPool transformation (#17694 )

2019-03-12 20:10:46 -07:00

transformations.py

support pre-convert filter format for mkldnn training mode and change 'OptimizeForIdeep' to 'OptimizeForMkldnn' (#15171 )

2019-03-29 19:00:48 -07:00

tt_core_test.py

…

tt_core.py

…

utils_test.py

…

utils.py

Query caffe2 operator stats for detailed execution info (#20924 )

2019-06-13 23:41:04 -07:00

visualize.py

…

workspace_test.py

Revert "Revert D18171156: Merge Tensor and Variable." (#29299 )

2019-11-08 09:11:20 -08:00

workspace.py

Add option to clean up allocated activations between c2 runs (#29619 )

2019-11-13 10:30:10 -08:00