pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2026-01-15 12:15:51 +00:00


Yuchen Hao 4a751dfc20 optimize MulGradient for common shapes (#19705)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19705

Optimizes the case where a run of consecutive dims that are not broadcast is followed by a run of consecutive dims that are broadcast.
For example, MulGradient(["dC", "A", "B"], ["dA", "dB"], broadcast=True, axis=0) where A.shape == dC.shape == [9508, 80] and B.shape == [80].
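
For readers unfamiliar with the operator, the gradients being computed can be sketched in NumPy. This is only a reference for the math of C = A * B with a broadcast B, not the Caffe2 kernel this commit optimizes. It assumes NumPy's trailing-dimension broadcasting for the shapes quoted above and does not model the Caffe2 `axis` argument. The function name `mul_gradient_reference` is hypothetical.

```python
import numpy as np


def mul_gradient_reference(dC, A, B):
    """Reference gradients for C = A * B, with B broadcast over A's leading dims.

    Assumption: NumPy (trailing-dimension) broadcasting, not Caffe2's `axis` rule.
    """
    # dA has A's shape: every element of dC is scaled by the matching entry of B.
    dA = dC * B
    # dB has B's shape: sum dC * A over the leading axes B was broadcast along.
    reduce_axes = tuple(range(A.ndim - B.ndim))
    dB = (dC * A).sum(axis=reduce_axes)
    return dA, dB


# Shapes taken from the commit message.
rng = np.random.default_rng(0)
A = rng.random((9508, 80), dtype=np.float32)
B = rng.random(80, dtype=np.float32)
dC = rng.random((9508, 80), dtype=np.float32)

dA, dB = mul_gradient_reference(dC, A, B)
print(dA.shape, dB.shape)  # (9508, 80) (80,)
```

The dB term is a reduction over 9508 rows, so the memory access pattern of that reduction is a plausible place for a shape-specific fast path to pay off. The commit message itself does not say where the speedup comes from.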

Test Plan:
In SKL T6,

Running mul_gradient_benchmark without this optimization
Operator #0 (dA, MulGradient) 11.9119 ms/iter

After this optimization,
Operator #0 (dA, MulGradient) 0.672759 ms/iter

D15291800 needs to land first to fix the unit test error.

Reviewed By: dmudiger

Differential Revision: D15075415

fbshipit-source-id: 0f97be17cf8f1dacbafa34cd637fb8bc1c5e5387

2019-12-11 11:39:52 -08:00

__init__.py

…

activation_ops_test.py

Fix relu bug for empty tensor (#19451)

2019-04-19 15:21:07 -07:00

adadelta_test.py

eliminate FE_INVALID in optimizer related operators and tests (#20501)

2019-05-16 08:23:46 -07:00

adagrad_test_helper.py

add decay parameter in ref_adagrad (#15329)

2019-05-07 18:58:58 -07:00

adagrad_test.py

remove unused parameters in optimizer tests (#18084)

2019-03-15 18:06:15 -07:00

adam_test.py

eliminate FE_INVALID in optimizer related operators and tests (#20501)

2019-05-16 08:23:46 -07:00

affine_channel_op_test.py

…

apmeter_test.py

…

arg_ops_test.py

…

assert_test.py

…

atomic_ops_test.py

Disables test_atomic_ops and testInputOrder (#29145)

2019-11-05 16:53:53 -08:00

basic_rnn_test.py

…

batch_box_cox_test.py

…

batch_bucketize_op_test.py

…

batch_moments_op_test.py

…

batch_sparse_to_dense_op_test.py

Add cuda version for operators BatchSparseToDense and BatchDenseToSparse (#29166)

2019-11-05 13:06:23 -08:00

bbox_transform_test.py

…

bisect_percentile_op_test.py

…

blobs_queue_db_test.py

caffe2 - set up correct inheritance structure for remaining operator test classes (#18622)

2019-04-01 15:53:22 -07:00

boolean_mask_test.py

Adding gradient to Boolean Mask operator (#21423)

2019-06-06 20:48:47 -07:00

boolean_unmask_test.py

…

box_with_nms_limit_op_test.py

fix bug of not using get_score_cls_index in BoxWithNMSLimitOp (#20868)

2019-05-24 22:31:01 -07:00

bucketize_op_test.py

Move bucketize_op to open source

2019-05-20 18:03:27 -07:00

cast_op_test.py

…

ceil_op_test.py

…

channel_backprop_stats_op_test.py

…

channel_shuffle_test.py

batch size 0 support in ChannelShuffle DNNLOWP op (#26858)

2019-09-26 00:40:07 -07:00

channel_stats_op_test.py

Optimize channel_stats_op (#16243)

2019-03-12 12:08:00 -07:00

checkpoint_test.py

caffe2 - set up correct inheritance structure for remaining operator test classes (#18622)

2019-04-01 15:53:22 -07:00

clip_op_test.py

…

clip_tensor_op_test.py

…

collect_and_distribute_fpn_rpn_proposals_op_test.py

split and register CollectAndDistributeFpnRpnProposals with C10

2019-05-16 13:40:46 -07:00

concat_split_op_test.py

…

conditional_test.py

…

conftest.py

…

conv_test.py

Unify gpu_support variable in python tests (#16748)

2019-02-07 00:29:51 -08:00

conv_transpose_test.py

Add support for group ConvTranspose (#18794)

2019-04-04 11:52:06 -07:00

copy_ops_test.py

caffe2 - set up correct inheritance structure for remaining operator test classes (#18622)

2019-04-01 15:53:22 -07:00

copy_rows_to_tensor_op_test.py

Perform weight re-init for embedding table in sparse_lookup.py (#22348)

2019-07-03 10:33:40 -07:00

cosine_embedding_criterion_op_test.py

…

counter_ops_test.py

…

crf_test.py

…

cross_entropy_ops_test.py

…

ctc_beam_search_decoder_op_test.py

Output sequence probability with CTC beam search, optional multiple output sequences (#21927)

2019-07-02 17:29:13 -07:00

ctc_greedy_decoder_op_test.py

…

cudnn_recurrent_test.py

Unify gpu_support variable in python tests (#16748)

2019-02-07 00:29:51 -08:00

data_couple_op_test.py

…

dataset_ops_test.py

Fix typos (#30606)

2019-12-02 20:17:42 -08:00

deform_conv_test.py

no EIGEN engine for DeformConv (#16785)

2019-02-06 11:59:31 -08:00

dense_vector_to_id_list_op_test.py

…

depthwise_3x3_conv_test.py

add NCHW2NHWC and NHWC2NCHW in utils.py (#15588)

2018-12-28 17:34:50 -08:00

detectron_keypoints.py

…

distance_op_test.py

…

dropout_op_test.py

…

duplicate_operands_test.py

…

elementwise_linear_op_test.py

…

elementwise_logical_ops_test.py

…

elementwise_op_broadcast_test.py

Unify gpu_support variable in python tests (#16748)

2019-02-07 00:29:51 -08:00

elementwise_ops_test.py

…

emptysample_ops_test.py

…

enforce_finite_op_test.py

…

ensure_clipped_test.py

…

ensure_cpu_output_op_test.py

…

erf_op_test.py

Export PyTorch erf to ONNX Erf and add Caffe2 Erf operator

2019-01-17 09:18:08 -08:00

expand_op_test.py

…

fc_operator_test.py

Fix spelling errors (#21665)

2019-06-13 15:21:55 -07:00

feature_maps_ops_test.py

…

filler_ops_test.py

…

find_op_test.py

…

flatten_op_test.py

…

flexible_top_k_test.py

…

floor_op_test.py

…

gather_ops_test.py

Fix typos (#30606)

2019-12-02 20:17:42 -08:00

gather_ranges_op_test.py

Remove error logging of high empty range ratio

2019-10-30 12:55:25 -07:00

given_tensor_byte_string_to_uint8_fill_op_test.py

…

given_tensor_fill_op_test.py

Add GivenTensorInt16Fill (#20515)

2019-05-15 19:45:15 -07:00

glu_op_test.py

…

group_conv_test.py

omit group conv NHWC test for GPU (#17715)

2019-03-06 11:32:35 -08:00

group_norm_op_test.py

Optimize channel_stats_op (#16243)

2019-03-12 12:08:00 -07:00

gru_test.py

…

heatmap_max_keypoint_op_test.py

2019-08-27 20:13:57 -07:00

hsm_test.py

…

hyperbolic_ops_test.py

…

im2col_col2im_test.py

…

image_input_op_test.py

…

index_hash_ops_test.py

Support in-place update in IndexHashOp (#30275)

2019-11-22 14:49:28 -08:00

index_ops_test.py

…

instance_norm_test.py

Optimize InstanceNormGradientOp (#22288)

2019-07-01 15:10:17 -07:00

integral_image_ops_test.py

…

jsd_ops_test.py

…

key_split_ops_test.py

…

lars_test.py

Add missing shebangs to Python files with executable permissions.

2019-06-06 10:53:40 -07:00

layer_norm_op_test.py

Add elementwise_affine for LayerNormGradientOp (#19982)

2019-05-03 15:33:46 -07:00

leaky_relu_test.py

add NCHW2NHWC and NHWC2NCHW in utils.py (#15588)

2018-12-28 17:34:50 -08:00

learning_rate_adaption_op_test.py

…

learning_rate_op_test.py

fix composite learning rate (#26227)

2019-09-18 17:34:17 -07:00

length_split_op_test.py

…

lengths_pad_op_test.py

…

lengths_tile_op_test.py

…

lengths_top_k_ops_test.py

…

listwise_l2r_operator_test.py

add LambdaRank DCG Loss Option (#23679)

2019-08-02 11:47:46 -07:00

load_save_test.py

Avoid Output Uninitialized Blobs in Load with load_all=1 (#19133)

2019-04-27 10:45:44 -07:00

locally_connected_op_test.py

add NCHW2NHWC and NHWC2NCHW in utils.py (#15588)

2018-12-28 17:34:50 -08:00

loss_ops_test.py

…

lpnorm_op_test.py

…

map_ops_test.py

…

margin_ranking_criterion_op_test.py

…

math_ops_test.py

eliminate FE_INVALID in unit test (#20502)

2019-05-16 21:55:28 -07:00

matmul_op_test.py

…

mean_op_test.py

…

merge_id_lists_op_test.py

…

mkl_conv_op_test.py

…

mkl_packed_fc_op_test.py

…

mkl_speed_test.py

…

mod_op_test.py

…

moments_op_test.py

…

momentum_sgd_test.py

Unify gpu_support variable in python tests (#16748)

2019-02-07 00:29:51 -08:00

mpi_test.py

…

mul_gradient_benchmark.py

optimize MulGradient for common shapes (#19705)

2019-12-11 11:39:52 -08:00

negate_gradient_op_test.py

…

ngram_ops_test.py

…

normalize_op_test.py

…

numpy_tile_op_test.py

…

one_hot_ops_test.py

…

onnx_while_test.py

…

order_switch_test.py

add NCHW2NHWC and NHWC2NCHW in utils.py (#15588)

2018-12-28 17:34:50 -08:00

pack_ops_test.py

Support full id interations (#28769)

2019-10-29 14:56:18 -07:00

pack_rnn_sequence_op_test.py

…

pad_test.py

…

partition_ops_test.py

…

percentile_op_test.py

…

piecewise_linear_transform_test.py

…

pooling_test.py

Fix typos (#30606 )

2019-12-02 20:17:42 -08:00

prepend_dim_test.py

…

python_op_test.py

…

rand_quantization_op_speed_test.py

…

rand_quantization_op_test.py

Disables flaky test_rand_quantization (#29463)

2019-11-08 13:51:22 -08:00

rank_loss_operator_test.py

…

rebatching_queue_test.py

…

record_queue_test.py

…

recurrent_net_executor_test.py

caffe2 - set up correct inheritance structure for remaining operator test classes (#18622)

2019-04-01 15:53:22 -07:00

recurrent_network_test.py

…

reduce_ops_test.py

…

reduction_ops_test.py

…

reshape_ops_test.py

ReshapeOp supports empty tensor (#21230)

2019-06-06 15:02:11 -07:00

resize_op_test.py

Add NHWC support to Resize Operator (#15553)

2019-01-08 16:44:17 -08:00

rmac_regions_op_test.py

…

rnn_cell_test.py

…

roi_align_rotated_op_test.py

…

scale_op_test.py

ScaleBlobs Operator (#19660)

2019-05-08 17:57:33 -07:00

segment_ops_test.py

…

selu_op_test.py

…

sequence_ops_test.py

…

shape_inference_test.py

shape inference for learning rate op (#20020)

2019-05-14 23:34:32 -07:00

sinusoid_position_encoding_op_test.py

…

softmax_ops_test.py

Support softmax with D == 0 (#29167)

2019-11-11 00:46:10 -08:00

softplus_op_test.py

…

sparse_dropout_with_replacement_op_test.py

Implement dropout with replacement for id list features. (#22880)

2019-07-23 14:34:21 -07:00

sparse_gradient_checker_test.py

…

sparse_lengths_sum_benchmark.py

add options to flush cache in SLS benchmarks (#25530)

2019-09-03 05:09:03 -07:00

sparse_normalize_test.py

Make SparseNormalize backwards compatible (#25660)

2019-09-05 15:14:21 -07:00

sparse_ops_test.py

…

sparse_to_dense_mask_op_test.py

…

spatial_bn_op_test.py

Unify gpu_support variable in python tests (#16748)

2019-02-07 00:29:51 -08:00

specialized_segment_ops_test.py

…

square_root_divide_op_test.py

…

stats_ops_test.py

…

stats_put_ops_test.py

…

string_ops_test.py

…

text_file_reader_test.py

…

thresholded_relu_op_test.py

…

tile_op_test.py

…

top_k_test.py

…

torch_integration_test.py

Expose PiecewiseLinearTransform to PyTorch

2019-09-27 12:49:04 -07:00

transpose_op_test.py

Fix bug in caffe2 transpose on GPU (#22233)

2019-06-26 11:33:25 -07:00

trigonometric_op_test.py

…

unique_ops_test.py

Enable arg_ops_test/unique_ops_test on AMD/rocm (#16853)

2019-02-07 16:51:15 -08:00

unique_uniform_fill_op_test.py

…

upsample_op_test.py

…

utility_ops_test.py

…

video_input_op_test.py

update video input (#22471)

2019-07-05 00:56:33 -07:00

weighted_multi_sample_test.py

…

weighted_sample_test.py

…

weighted_sum_test.py

…

wngrad_test.py

remove unused parameters in optimizer tests (#18084)

2019-03-15 18:06:15 -07:00