Commit Graph

2578 Commits

Yuchen Hao
4a751dfc20 optimize MulGradient for common shapes (#19705)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19705

Optimize the case where a run of consecutive non-broadcast dimensions is followed by a run of consecutive broadcast dimensions.
For example, MulGradient(["dC", "A", "B"], ["dA", "dB"], broadcast=True, axis=0) where A.shape == dC.shape == [9508, 80] and B.shape == [80].
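
A minimal sketch of exercising the op through the Caffe2 Python API, assuming the standard Caffe2 broadcast semantics (trailing-dimension alignment when axis is not given) and using smaller shapes to keep the run quick:

```python
from caffe2.python import core, workspace
import numpy as np

# A and dC share a shape; B is broadcast along the trailing dimension.
A = np.random.rand(16, 80).astype(np.float32)
B = np.random.rand(80).astype(np.float32)
dC = np.random.rand(16, 80).astype(np.float32)
for name, blob in (("A", A), ("B", B), ("dC", dC)):
    workspace.FeedBlob(name, blob)

grad_op = core.CreateOperator(
    "MulGradient", ["dC", "A", "B"], ["dA", "dB"], broadcast=1)
workspace.RunOperatorOnce(grad_op)
print(workspace.FetchBlob("dA").shape, workspace.FetchBlob("dB").shape)  # (16, 80) (80,)
```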

Test Plan:
In SKL T6,

Running mul_gradient_benchmark without this optimization
Operator #0 (dA, MulGradient) 11.9119 ms/iter

After this optimization,
Operator #0 (dA, MulGradient) 0.672759 ms/iter

Need to land D15291800 first to fix the unit test error.

Reviewed By: dmudiger

Differential Revision: D15075415

fbshipit-source-id: 0f97be17cf8f1dacbafa34cd637fb8bc1c5e5387
2019-12-11 11:39:52 -08:00
Summer Deng
a42d093db2 FCTransposed to FbFCPacked (#29766)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29766

Add an FbgemmPackTranspose op to support packing of FCTransposed weights.

Add an FCTransposed-to-FbFCPacked transformation to the Dper fp16 exporter.
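
For context, FCTransposed is the stock Caffe2 FC variant whose weight is stored already transposed ((K, N) instead of FC's (N, K)); the FbgemmPackTranspose/FbFCPacked ops themselves are internal, so this sketch only shows the unpacked op the transformation starts from:

```python
from caffe2.python import core, workspace
import numpy as np

# FCTransposed computes Y = X * W + b with W stored as (K, N),
# i.e. already transposed relative to FC's (N, K) layout.
workspace.FeedBlob("X", np.random.randn(4, 8).astype(np.float32))
workspace.FeedBlob("W", np.random.randn(8, 16).astype(np.float32))  # (K, N)
workspace.FeedBlob("b", np.zeros(16, dtype=np.float32))

op = core.CreateOperator("FCTransposed", ["X", "W", "b"], ["Y"])
workspace.RunOperatorOnce(op)
print(workspace.FetchBlob("Y").shape)  # (4, 16)
```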

Test Plan:
```
buck test mode/opt caffe2/caffe2/fb/fbgemm:fb_fc_packed_op_test
```

```
buck test mode/opt caffe2/caffe2/python:layers_test
```

Differential Revision: D18482306

fbshipit-source-id: e8f1947b3d0d04892293509ebf88742f5f0f5997
2019-12-10 10:18:21 -08:00
Lu Fang
c34ef1aa2e Automatic update of fbcode/onnx to c08a7b76cf7c1555ae37186f12be4d62b2c39b3b (#30619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30619

Previous import was fea8568cac61a482ed208748fdc0e1a8e47f62f5

Included changes:
- **[c08a7b76](https://github.com/onnx/onnx/commit/c08a7b76)**: doc: fix some typos at ONNXIFI (#2473) <Yorkie Liu>
- **[4be12d46](https://github.com/onnx/onnx/commit/4be12d46)**: remove workshop update since it is done (#2460) <Prasanth Pulavarthi>
- **[86107d1b](https://github.com/onnx/onnx/commit/86107d1b)**: Updated with correct URL to LICENSE (#2468) <Ryan Loney>
- **[9bf6fbb6](https://github.com/onnx/onnx/commit/9bf6fbb6)**: Update Argmin/Argmax (#2461) <Lara Haidar>
- **[748d81b8](https://github.com/onnx/onnx/commit/748d81b8)**: Fix windows conda build (#2452) <Ashwini Khade>
- **[a32db1c5](https://github.com/onnx/onnx/commit/a32db1c5)**: Delete duplicate word in comment (#2439) <Haibo Hao>
- **[e108da9a](https://github.com/onnx/onnx/commit/e108da9a)**: Fix bug in function body verifier (#2390) <G. Ramalingam>
- **[c3d3ef82](https://github.com/onnx/onnx/commit/c3d3ef82)**: docs: fix typo in IR.md (#2441) <Elliot Waite>

Test Plan: ci

Reviewed By: hl475

Differential Revision: D18766132

fbshipit-source-id: 13c04f21399579acb87a8f9fac2e4c329b0720b8
2019-12-10 10:15:08 -08:00
Chunli Fu
42324cb6e8 Change interface from map of TensorShape to shapeInfoMap (#30802)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30802

Change shape_hints from map<string, TensorShape> to ShapeInfoMap to capture dimType info from the model file.

Reviewed By: ipiszy

Differential Revision: D18821486

fbshipit-source-id: c5d9ed72e158d3698aba38900aeda00f776745b4
2019-12-10 00:35:11 -08:00
Supriya Rao
a51c5f5cbf Add JIT pass to insert permutes for conv ops (#30679)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30679

Caffe2 expects quantized ops to run in NHWC format, while PyTorch inputs are NCHW.
Add a JIT pass that inserts an NCHW-to-NHWC permute before each conv op and an NHWC-to-NCHW permute after it.
A graph rewriter then finds consecutive redundant permutes and removes them from the graph.
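
For reference, the permutes the pass inserts are the standard NCHW<->NHWC transpositions; a plain PyTorch sketch (not the JIT pass itself):

```python
import torch

x_nchw = torch.randn(1, 3, 224, 224)             # PyTorch activation layout

# Inserted before the Caffe2 quantized conv, which expects NHWC:
x_nhwc = x_nchw.permute(0, 2, 3, 1).contiguous()

# Inserted after the conv to restore the original layout:
x_back = x_nhwc.permute(0, 3, 1, 2).contiguous()
assert torch.equal(x_back, x_nchw)
```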

Test Plan:
python test/onnx/test_pytorch_onnx_caffe2_quantized.py TestQuantizedOps

Imported from OSS

Differential Revision: D18790518

fbshipit-source-id: 4dd39cf0b31b21f5586c0edfdce2260d4e245112
2019-12-05 18:51:16 -08:00
Brian Wignall
e7fe64f6a6 Fix typos (#30606)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606

Differential Revision: D18763028

Pulled By: mrshenli

fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c
2019-12-02 20:17:42 -08:00
Chuan Jiang
6c9b188262 Support in-place update in IndexHashOp (#30275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30275

`IndexHash` did not support in-place update.

Reviewed By: kennyhorror

Differential Revision: D18612231

fbshipit-source-id: adeccdf1ceb6107454555ff9cdf66fd5e5773f2a
2019-11-22 14:49:28 -08:00
Mengshi Zhang
5b6dd52e3c Build Unit Test of SparseRAdam
Summary: We added a Caffe2 Python wrapper and a unit test for the SparseRAdam C++ operator.

Test Plan:
The unit test follows the design pattern of the [Wngrad optimizer](https://our.intern.facebook.com/intern/diff/D8655724/). The test passed smoothly.
buck test //caffe2/caffe2/python:optimizer_test -- TestSparseRAdam

Test result:
{F221144048}

Reviewed By: wx1988

Differential Revision: D18330650

fbshipit-source-id: e0f4724c2b616b665e2a0fe2e5c3430696cca7ee
2019-11-18 15:22:37 -08:00
Lei Zhang
b45069b59f fix fc fp16 quantization (#29469)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29469

The original approach saved both fp16 and fp32 weights for all models, which increased file size and memory use.

This diff saves only the 'used' blobs into the predictor file.

Test Plan:
fc clone workflow :
f149878151

ctr mbl feed test with fc fp16 quantization:
f149996395

No fp32 in local file
{F221750392}

QRT after the fix:
https://fburl.com/qrt/cp8r8263

Reviewed By: wx1988

Differential Revision: D18382503

fbshipit-source-id: 231c41668f25b1d35ca8d4358ce9b12ba60a4f91
2019-11-18 11:26:49 -08:00
James Reed
7a6c3b36a1 Switch ScriptModuleOp to use a unique_ptr
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29856

Test Plan: waitforsadcastle

Reviewed By: dzhulgakov

Differential Revision: D18516553

fbshipit-source-id: d1e2d49ec613d07b21cd30bd777fbd300032cba1
2019-11-14 19:36:00 -08:00
Yangxin Zhong
ed788ec780 Linearizable Label: Class Weights, Allow Missing Label, and Average by Batch Size (#29707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29707

In D17885977, Linearizable label (a multi-class classification) was implemented in MTML.

In this diff, we add several items for Linearizable label:

- Assigning different weights to each class through ```model_def.tasks[i].class_weights```.

  - This option is a dictionary, the keys of which are indices of the classes and the values of which are weights for each class.

  - For example, if a linearizable-label task has 4 classes and its ```class_weights = {"0": 1, "1": 0.1, "2": 0.1, "3": 0.01}```, then in the loss function of this task we assign weight 1 to its first class, weight 0.1 to its second and third classes, and weight 0.01 to its fourth class. The index/order of classes follows the logic of linearizable label.

  - Note that when you assign different weights to different classes, you need to correct the calibration by setting an appropriate ```model_def.tasks[i].calibration.linearizable_class_weight```. Basically, the class weights in calibration should be the reciprocals of the class weights in loss function. So the ```calibration.linearizable_class_weight = {"0": 1, "1": 10, "2": 10, "3": 100}``` for the example above.

  - Example FBLearner job: f150763093

- We also support ```model_def.allow_missing_label_with_zero_weight``` for linearizable label, which ignores examples whose first label is missing by assigning them zero weight in the loss function.

  - We need to set ```allow_missing_label_with_zero_weight = true``` to enable it.

  - Example FBLearner job: f150763093

- Last but not least, we update the caffe2 operator ```SoftmaxWithLoss``` to support loss averaged by batch size (a sketch of the weighting and averaging behavior follows this list).

  - We need to set ```model_def.tasks[i].loss.softmaxLoss.average_by_batch_size = true``` to enable it.

  - Previously, the loss was averaged by the weight sum of examples in the batch, which is still the default behavior (when ```average_by_batch_size = null``` or ```average_by_batch_size = false```).

  - Without this new feature, the calibration will be incorrect when applying non-equal-weight training among different classes to a linearizable task.

  - Example FBLearner job with ```average_by_batch_size = true``` results in a correct calibration: f150763093

  - Example FBLearner job with ```average_by_batch_size = null``` results in an incorrect calibration: f150762990
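
A rough NumPy sketch of the weighting and averaging described above (illustrative only; the real implementation lives in the C++ SoftmaxWithLoss operator and dper's calibration layers):

```python
import numpy as np

def weighted_softmax_loss(logits, labels, class_weights, average_by_batch_size=False):
    # logits: (N, C), labels: (N,) int class indices, class_weights: (C,)
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    per_example_w = class_weights[labels]
    per_example_loss = -per_example_w * np.log(probs[np.arange(len(labels)), labels])
    if average_by_batch_size:
        return per_example_loss.sum() / len(labels)        # divide by N
    return per_example_loss.sum() / per_example_w.sum()    # default: divide by weight sum

logits = np.random.randn(8, 4)
labels = np.random.randint(0, 4, size=8)
class_weights = np.array([1.0, 0.1, 0.1, 0.01])
print(weighted_softmax_loss(logits, labels, class_weights, average_by_batch_size=True))
```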

Test Plan:
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_class_weights
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_zero_weight
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_average_by_batch_size

All tests passed.

full canary: https://fburl.com/fblearner/troznfgh

Reviewed By: chenshouyuan

Differential Revision: D18461163

fbshipit-source-id: aaf3df031406ae94f74e2e365b57e47409ef0bfe
2019-11-13 16:52:27 -08:00
Yinghai Lu
f0dd7517f2 Add option to clean up allocated activations between c2 runs (#29619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29619

att

Reviewed By: houseroad

Differential Revision: D18415190

fbshipit-source-id: 739aaf436578fac635df10de42b35e2b4368df37
2019-11-13 10:30:10 -08:00
Huan Gui
be757957ba Support softmax with D == 0 (#29167)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29167

As titled.

This fix is crucial because multi_channel splitting can create a history that has no items (i.e., D == 0), which leads to flow failures.

Test Plan:
Unittest

flow test:

before fix: f148783160

after fix: f149082299

buck test mode/dev-nosan caffe2/caffe2/python/operator_test:softmax_ops_test

Reviewed By: xianjiec

Differential Revision: D18296081

fbshipit-source-id: e0bb2dc2c4e5b465e213f31e5c5ced3a7e1fd574
2019-11-11 00:46:10 -08:00
Mike Ruberry
991c2ac383 Disables flaky test_rand_quantization (#29463)
Summary:
See https://github.com/pytorch/pytorch/issues/28550.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29463

Differential Revision: D18405669

Pulled By: mruberry

fbshipit-source-id: 2984c3896a9260a06fbf052afb06e0cb8d28b53d
2019-11-08 13:51:22 -08:00
Xiaodong Wang
36b73d5a1b Hipify contrib/nccl (#29385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29385

hipify contrib/gloo

Test Plan: OSS & sandcastle build

Reviewed By: bddppq

Differential Revision: D18373308

fbshipit-source-id: 39c232db36318af116c341f64d03642639575ecd
2019-11-08 10:39:17 -08:00
Edward Yang
4e21157e01 Revert "Revert D18171156: Merge Tensor and Variable." (#29299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29299

This reverts commit 9c43b16df9, but also
with the changes from D18348622.  Comments there:

thpp-compatibility is used by admarket/adreview/service:adreviewservice and
libtorch is too big for the service to deal with.

thpp-compatibility doesn't support autograd, so we hack around dispatching
variables by using AutoNonVariableTypeMode everywhere we call into ATen,
so we never attempt to call into Variable stubs.  If you get it wrong,
you'll get an error like:

```
what():  Could not run 'aten::empty' with arguments from the 'VariableTensorId' backend. 'aten::empty' is only available for these backends: [SparseCPUTensorId, CPUTensorId, MkldnnCPUTensorId]. (lookup_ at caffe2/aten/src/ATen/core/dispatch/DispatchTable.h:298)
```

Test Plan:
Imported from OSS

```
buck test //thpp-compatibility/...
buck build mode/opt-clang admarket/adreview/service:adreviewservice
```

adreviewservice canary: https://our.intern.facebook.com/intern/ads/canary/422290029716387895 (comparing against parent comment due to current breakage) ==> experiment store https://our.intern.facebook.com/intern/experiment_store/experiment/43990006/
adfinder canary: https://our.intern.facebook.com/intern/ads/canary/422268535840333934
adindexer canary: https://our.intern.facebook.com/intern/ads/canary/422268550559034675

adreview second canary:  https://our.intern.facebook.com/intern/ads/canary/422307863515591925

canary without thpp-compat fixups https://our.intern.facebook.com/intern/ads/canary/422308951649168772

Reviewed By: dreiss

Differential Revision: D18353504

Pulled By: ezyang

fbshipit-source-id: 65feaba39fa07bb66762810909aeb38868668a30
2019-11-08 09:11:20 -08:00
Mike Ruberry
74b2d9ed2e Skips test_equiv_recurrent (#29255)
Summary:
This test is flaky, per issue https://github.com/pytorch/pytorch/issues/10322.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29255

Differential Revision: D18350782

Pulled By: mruberry

fbshipit-source-id: 53a7d33e17428c2484211618cb71e870ce2d6a03
2019-11-06 13:29:23 -08:00
Edward Yang
9c43b16df9 Revert D18171156: Merge Tensor and Variable.
Test Plan: revert-hammer

Differential Revision:
D18171156

Original commit changeset: 5b6a045beba3

fbshipit-source-id: f5581d902c2305018ea49f8473592be2a465560b
2019-11-06 10:57:00 -08:00
Mike Ruberry
2f2a0d1607 Disables test_atomic_ops and testInputOrder (#29145)
Summary:
These tests have been flaky for some time, see:

- https://github.com/pytorch/pytorch/issues/28179
- https://github.com/pytorch/pytorch/issues/9064

This PR disables them. The actual tests were added/updated 2+ years ago. It's unclear who, if anyone, would own them now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29145

Differential Revision: D18327937

Pulled By: mruberry

fbshipit-source-id: d02731d662aff3545b581272e5ae8db4e3097d87
2019-11-05 16:53:53 -08:00
Huan Gui
8a2dcff189 Add cuda version for operators BatchSparseToDense and BatchDenseToSparse (#29166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29166

As titled

Test Plan:
unittest

 buck test  mode/dev-nosan  caffe2/caffe2/python/operator_test:batch_sparse_to_dense_op_test

Reviewed By: xianjiec

Differential Revision: D18197966

fbshipit-source-id: 7486300c509dd552ddb7484c2d83099f62878278
2019-11-05 13:06:23 -08:00
Kevin Chen
1189f559cc Creating new layer FCWithBootstrap used in bootstrapping uncertainty approach (#29152)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29152

Bootstrapping uncertainty approach: bootstrap the layer just before the final fully-connected layer. FCWithBootstrap is a new layer that handles the logic of the bootstrapping process (a rough sketch of the resampling idea follows the goal list below).

Goal:
- return a struct with the bootstrapped indices and bootstrapped predictions from this layer
- separate the functionality in the train_net and eval_net
- save the bootstrapped FC in this object so that the eval_net can use them during prediction time
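
A rough sketch of the resampling idea in plain NumPy (illustrative names only; the real layer builds this into the Caffe2 net):

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, num_bootstrap = 32, 4
inputs = rng.normal(size=(batch_size, 16)).astype(np.float32)

for k in range(num_bootstrap):
    # Sample row indices with replacement; the indices are kept so the
    # eval net can apply the k-th bootstrapped FC to the same resampling.
    idx = rng.integers(0, batch_size, size=batch_size)
    resampled = inputs[idx]
    # ... resampled rows feed the k-th bootstrapped FC ...
```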

Reviewed By: wx1988

Differential Revision: D17822429

fbshipit-source-id: 15dec501503d581aeb69cb9ae9e8c3a3fbc7e7b5
2019-11-04 21:18:15 -08:00
Kevin Chen
56f7415795 L0 norm approx with budget (#29155)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29155

Update the L0 norm regularizer with a budget feature that penalizes features over the budget.

Formula and summary:

{F212248495}

Test Plan: * Unit test located in: ~/fbsource/fbcode/caffe2/caffe2/fb/dper/layer_models/tests/split_1/fsparse_nn_test.py

Reviewed By: un-disclosed, wx1988

Differential Revision: D17458138

fbshipit-source-id: 2ed9ce6f55573b0bfc0fefbfd392f90c7542a0fd
2019-11-04 21:09:53 -08:00
Xiaodong Wang
cb72c9f5b1 Make caffe2/fb folder compatible with AMD (#29131)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29131

caffe2_pb2.CUDA --> workspace.GpuDeviceType
workspace.NumCudaDevices() --> workspace.NumGpuDevices()

Also added totalGlobalMem to get_device_properties(), which is needed by multi_gpu_utils.py.
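
A minimal before/after sketch of the renamed Python calls:

```python
from caffe2.python import workspace

# Before (CUDA-only spellings):
#   from caffe2.proto import caffe2_pb2
#   device_type = caffe2_pb2.CUDA
#   num_gpus = workspace.NumCudaDevices()

# After (works on both CUDA and AMD/HIP builds):
device_type = workspace.GpuDeviceType
num_gpus = workspace.NumGpuDevices()
print(device_type, num_gpus)
```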

Test Plan:
sandcastle

f148921769

Reviewed By: bddppq

Differential Revision: D18290090

fbshipit-source-id: bde7c175d1fb6ff59a062266c1b17de39d113b24
2019-11-04 16:40:29 -08:00
Edward Yang
25261a4776 Merge Tensor and Variable. (#28620)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28620

All Tensors are Variables now, they just happen to have requires_grad=False. Tensors ALWAYS have `VariableTensorId` in their type set.

When constructing this patch, I had to make decisions about what I would fix in this patch, and what I would leave for follow up PRs. Here is the cleanup that happens in this patch:

- The `is_variable` property is removed from TensorOptions. I removed this immediately because unlike Tensor::is_variable, TensorOptions::is_variable doesn't respect our VariableTensorId thread-local state. This means that there were a bunch of places where TensorOptions::is_variable was false, which is obviously bogus in the world when tensor and variable are merged. Instead of keeping the method as a function that always returns true, I just opted to remove it entirely (it's not public API.) All places we set `is_variable` are deleted.
  - Knock on effect: there is no longer a separate DeprecatedTypeProperties for the variable and non-variable versions of type.
  - Knock on effect: instead of asserting on TensorOptions::is_variable, instead we just test `at::impl::variable_is_excluded()`
- There is now only one copy of the cuDNN RNN dropout cache, not two (I'm not sure why we had two to begin with)

Some cleanup that doesn't happen in this patch:
- Eliminating unnecessary uses of `make_variable`
- Eliminating `Tensor::is_variable`

The most subtle part of this patch is retaining tracing behavior: the fact that everything is a Variable means that more code gets routed to VariableType than before; this can change traces. I identified two places where we didn't appropriately turn off VariableType, mostly factory functions:

- `torch.tensor` must turn off VariableType before invoking `at::empty` to construct the tensor, as it subsequently does direct data access
- `tensor_slow` (invoked when you pass a Python scalar to a tensor argument) must turn off VariableType before calling `scalar_to_tensor` so the scalar gets traced as constant, rather than as a call to `scalar_to_tensor`.

Honestly, these are all giant hacks, and should be replaced with a more specialized guard that just toggles tracing.
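
The user-visible effect can be seen from Python: after the merge there is only one tensor class, and "being a Variable" reduces to requires_grad plus the VariableTensorId dispatch key. A small illustration (not part of the patch):

```python
import torch

t = torch.empty(2, 3)                      # formerly a "plain" Tensor
v = torch.empty(2, 3, requires_grad=True)  # formerly a "Variable"

# Both are instances of the same class now; only requires_grad differs.
assert type(t) is type(v) is torch.Tensor
print(t.requires_grad, v.requires_grad)    # False True
```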

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: dreiss

Differential Revision: D18171156

Pulled By: ezyang

fbshipit-source-id: 5b6a045beba37492647e350190f495114e86504d
2019-11-04 14:59:57 -08:00
Kevin Wilfong
cddda17394 ParallelWorkersTest.testParallelWorkersInitFun is flaky (#29045)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29045

Addressing an issue seen in GitHub https://github.com/pytorch/pytorch/issues/28958

It seems sometimes the workers in this test don't stop cleanly.  The purpose of this test is to check that the init_fun in init_workers works as expected, which is captured by the assertEqual in the for loop in the test.  The behavior of stop() is not really important here.

The fact that it returns false probably indicates that a worker is getting blocked, but that doesn't affect the correctness of the test.

Test Plan: Ran the test 100 times, it consistently succeeds.

Reviewed By: akyrola

Differential Revision: D18273064

fbshipit-source-id: 5fdff8cf80ec7ba04acf4666a3116e081d96ffec
2019-11-01 13:59:02 -07:00
Sergei Nikolaev
1e2049c566 #26426 fixed (#28715)
Summary:
This is the fix for reverted https://github.com/pytorch/pytorch/issues/26426
houseroad bddppq soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28715

Reviewed By: hl475

Differential Revision: D18146731

Pulled By: houseroad

fbshipit-source-id: 247366451a6334e84df82d00339521f797b33130
2019-11-01 12:53:01 -07:00
Xinyi Zhang
5821b9bf0f Remove error logging of high empty range ratio
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28854

Reviewed By: xianjiec

Differential Revision: D18206695

fbshipit-source-id: 4ce471f0236b2ceaf54ba1b1ce96e193feca720b
2019-10-30 12:55:25 -07:00
Huayu Li
793e2914e4 Support full id interactions (#28769)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28769

Support full id interaction.

Test Plan:
* unit-tests
  * buck test caffe2/caffe2/python/operator_test:pack_ops_test --
  * buck test caffe2/caffe2/fb/dper/layer_models/tests:sparse_nn_attention_test -- test_sparse_nn_full_id

* canary
  * apply SUM + full id with max_length as 20 on SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID: f147253340 (v1: f146340704)

# of embeddings for this feature is 20:
{F219139816}

The corresponding ops: two lookups, which is as expected.
```
op {
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_0/Repeat_0/sparse_lookup/w"
  input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:values"
  input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:lengths"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_0/Repeat_0/sparse_lookup/output"
  name: ""
  type: "SparseLengthsSum"
}
op {
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/w"
  input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:values"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/output"
  name: ""
  type: "Gather"
}
op {
  input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:lengths"
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/output"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/PackSegments/embedding_packed"
  name: ""
  type: "PackSegments"
  arg {
    name: "max_length"
    i: 20
  }
  arg {
    name: "pad_minf"
    i: 0
  }
}
op {
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/PackSegments/embedding_packed"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/reshaped_record"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/old_shape"
  name: ""
  type: "Reshape"
  arg {
    name: "shape"
    ints: -1
    ints: 1280
  }
}
op {
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/reshaped_record"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_0"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_1"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_2"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_3"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_4"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_5"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_6"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_7"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_8"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_9"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_10"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_11"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_12"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_13"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_14"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_15"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_16"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_17"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_18"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_19"
  name: ""
  type: "Split"
  arg {
    name: "axis"
    i: 1
  }
}
```

Reviewed By: chonglinsun

Differential Revision: D18083520

fbshipit-source-id: f592fb7734dd4e3e712ba42dc0afcd0b32a4afa0
2019-10-29 14:56:18 -07:00
Xinyi Zhang
f5ea2ca34a Reduce logging frequency for empty range tolerance
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28704

Reviewed By: xianjiec

Differential Revision: D18138828

fbshipit-source-id: 4f3c376502cb6e30b931217702c4ca537c9eb644
2019-10-28 09:52:17 -07:00
Lu Fang
c89340f068 Extend HasElements to support multiple inputs (#28717)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28717

Make HasElements support multiple inputs: if any input has elements, return true.
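
A small sketch of the extended op (multi-input support is what this diff adds, so it needs a build that includes it):

```python
from caffe2.python import core, workspace
import numpy as np

workspace.FeedBlob("empty", np.array([], dtype=np.float32))
workspace.FeedBlob("nonempty", np.array([1.0], dtype=np.float32))

# With several inputs, the output is true if any of them is non-empty.
op = core.CreateOperator("HasElements", ["empty", "nonempty"], ["out"])
workspace.RunOperatorOnce(op)
print(workspace.FetchBlob("out"))  # True
```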

Test Plan: to be added

Reviewed By: BIT-silence

Differential Revision: D17972759

fbshipit-source-id: 3ecdea74a30fcfaaa6490fef1debc6cde68db922
2019-10-27 23:00:07 -07:00
Junjie Bai
d37c2d7c8d Revert D17495965: TensorRT 6.0 support and PyTorch->ONNX->TRT6 unit test
Test Plan: revert-hammer

Differential Revision:
D17495965

Original commit changeset: 3e8dbe8943f5

fbshipit-source-id: d47fcbec22b0d61df41d7dbf15cfdde196ac818f
2019-10-25 13:58:16 -07:00
Sergei Nikolaev
4996e3aca2 TensorRT 6.0 support and PyTorch->ONNX->TRT6 unit test (#26426)
Summary:
This PR makes Caffe2 compatible with TensorRT 6. To make sure it works well, a new unit test is added. This test checks the PyTorch->ONNX->TRT6 inference flow for all classification models from the TorchVision zoo.
Note on the CMake changes: they had to be made in order to import the onnx-tensorrt project. See https://github.com/pytorch/pytorch/issues/18524 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26426

Reviewed By: hl475

Differential Revision: D17495965

Pulled By: houseroad

fbshipit-source-id: 3e8dbe8943f5a28a51368fd5686c8d6e86e7f693
2019-10-25 13:01:57 -07:00
Xinyi Zhang
2f16284231 Change empty range tolerance logging
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28489

Differential Revision: D18067322

fbshipit-source-id: 2096d1cce820f4ebe28db0045a2ddacc022e07da
2019-10-23 09:39:39 -07:00
Jason Fried
9705d60a2f Get rid of deprecated thread.isAlive() in favor of the modern is_alive() (Python 2.6+)
Summary:
Codemod to remove all thread.isAlive() calls, since the call throws a warning that is breaking some tests that monitor the output of their CLIs.

is_alive() was added in Python 2.6, so this is super safe.

This is a codemod; I don't care whether the code supports Python 3, just that it's Python code.
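
The change in a nutshell (threading.Thread.is_alive() has existed since Python 2.6; isAlive() warns on modern Python 3 and was later removed):

```python
import threading
import time

t = threading.Thread(target=time.sleep, args=(0.1,))
t.start()

# Old, deprecated spelling:   t.isAlive()
# Codemodded spelling:
print(t.is_alive())
t.join()
```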

Test Plan: unittests

Reviewed By: cooperlees

Differential Revision: D18069520

fbshipit-source-id: 4ca4dcb541c0b0debeb194aba5d060152ad0ef0e
2019-10-22 15:37:31 -07:00
Jiyan Yang
07a181da1d Add more logging in net modifier
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28327

Test Plan:
Failed as expected and the full protobuf is logged
f145060005

Reviewed By: ffjiang, wx1988

Differential Revision: D17975560

fbshipit-source-id: 5375acffc1f9dede16622b06eb58b6c3a26ebe5a
2019-10-21 17:53:00 -07:00
Xinyi Zhang
06bb74ce96 Tolerate small amount of embedding corruptions
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28371

Reviewed By: xianjiec

Differential Revision: D18031155

fbshipit-source-id: a51d2a62a919f032dc04372b30cf9071aa2dd629
2019-10-21 16:23:25 -07:00
Jiang Wu
29f56eb920 Revert D17937850: Tolerate small amount of embedding corruptions
Test Plan: revert-hammer

Differential Revision:
D17937850

Original commit changeset: e9c633768d98

fbshipit-source-id: 5c2c837c7867504392b19965d91a60cadd3b8101
2019-10-19 14:17:01 -07:00
Xinyi Zhang
ca6ba06f95 Tolerate small amount of embedding corruptions
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28299

Reviewed By: Wakeupbuddy

Differential Revision: D17937850

fbshipit-source-id: e9c633768d9819fd734ddd59017c33688ebbdcca
2019-10-18 14:59:06 -07:00
Peiyao Zhou
46fefc98e2 Change dper3 loss module to match dper2 (#28265)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28265

Fix the difference between dper3 and dper2 when regressionLoss is used.

Test Plan:
test using dper2 model id f134632386
Comparison tool output before change:
```
FOUND OP DIFFERENT WITH DPER2!!!
OP is of type ExpandDims
OP inputs ['supervision:label']
OP outputs ['sparse_nn/regression_loss/mean_squared_error_loss/ExpandDims:0']
===============================
Finished all dper3 ops, number of good ops 11, bad ops 1, skipped 26
run_comparison for dper2 / dper3 nets running time: 0.0020143985748291016
result type: <class 'NoneType'> result: None
```

After change:

```
FOUND OP DIFFERENT WITH DPER2!!!
OP is of type ExpandDims
OP inputs ['sparse_nn_2/regression_loss_2/mean_squared_error_loss_8/Squeeze:0_grad']
OP outputs ['sparse_nn_2/over_arch_2/linear_2/FC_grad']
===============================
Finished all dper3 ops, number of good ops 19, bad ops 1, skipped 16
run_comparison for dper2 / dper3 nets running time: 0.0017991065979003906
result type: <class 'NoneType'> result: None
```

dper2  label part of net P111794577
dper3  label part of net after change P116817194

Reviewed By: kennyhorror

Differential Revision: D17795740

fbshipit-source-id: 9faf96f5140f5a1efdf2985820bda3ca400f61fa
2019-10-18 10:08:38 -07:00
Long Jin
76bf8f62f7 fix loss_weight for self_supervision
Summary: previously loss_weight is not used correctly for self-supervision branch

Test Plan: buck test mode/dev-nosan //caffe2/caffe2/fb/dper/layer_models/models/experimental/tests:tum_test

Reviewed By: xianjiec

Differential Revision: D17862312

fbshipit-source-id: 554b793a5caa3886946c54333c81a0d8a10230d9
2019-10-15 10:40:48 -07:00
Alyssa Wang
4b1096c652 Fix predict net issue with LRU hash eviction
Summary:
We are seeing error "[enforce fail at BlackBoxPredictor.cpp:134] ! !parameter_workspace->HasBlob(out). Net REMOTE of type predict_net writes to blob cat/NGRAM_QRT_VERSIONS_x_EVENT_TYPE_AUTO_FIRST_X/Pool_Option_0/Repeat_0/sparse_lookup/w which exists in the parameter workspace" in online testing for calibration models.
I suspect it's due to the CopyRowsToTensorOp op being used in prediction.

Test Plan:
f143080108 offline predict net does not contain CopyRowsToTensorNet, which looks right.
Waiting for Olga to test online behavior
dper2 canary:
https://fburl.com/fblearner/sv3o3yj1

Differential Revision: D17741823

fbshipit-source-id: 19721b632b5ea9ebfa1ef9ae0e99d3a10c926287
2019-10-14 16:08:14 -07:00
Benny Chen
d23d62cb1e Fix unaries to export fp16 instead of fp32 when rest of the model export to int8
Summary: Currently accelerators do not have a concept of fp32; they only understand fp16 and int8 data inputs. To fix the issue here, we want to make sure unaries are turned into fp16 when the int8 exporter is turned on.

Reviewed By: kennyhorror

Differential Revision: D17743791

fbshipit-source-id: 7322d23eb12ac3f813b525fc0ddd066f95c8ca85
2019-10-14 10:51:17 -07:00
Lei Zhang
0e8d4836e4 add feature name into module and update position weighted to match dper2
Test Plan:
The notebook showed no diff for id score list
https://our.intern.facebook.com/intern/anp/view/?id=154764

Reviewed By: alyssawangqq

Differential Revision: D17649974

fbshipit-source-id: 84cb4ae372fc215295c2d0b139d65f4eacafae4a
2019-10-14 08:06:19 -07:00
Kevin Chen
275dfa3485 Initial commit for L0 norm approx (#27756)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27756

Implement approximate L0 norm for use in the dense feature regularizer that will be used for feature importance. The formula is as follows:
{F212246801}
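
The exact formula is in the attached figure; purely as a generic illustration of the idea (not necessarily the formula used in this diff), a smooth L0 surrogate replaces the 0/1 count of non-zero weights with a saturating function of their magnitudes:

```python
import numpy as np

def approx_l0(w, beta=1.0):
    # Each term goes to 1 as |w_i| grows and to 0 as |w_i| -> 0, so the sum
    # approximates the number of non-zero entries. Illustrative only.
    return np.sum(1.0 - np.exp(-np.abs(w) / beta))

print(approx_l0(np.array([0.0, 0.001, 0.5, -3.0])))
```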

Reviewed By: wx1988

Differential Revision: D17432708

fbshipit-source-id: 57d6c9c3dd1b4e210b9f10264075c57dbc9c8cb6
2019-10-11 11:24:34 -07:00
Kutta Srinivasan
415b17e81c Fix for flaky caffe2 dataio test (test_time_limit_reader_with_short_limit) (#27592)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27592

The caffe2 data reader test `test_time_limit_reader_with_short_limit` is flaky as-written because it places an upper bound on how much can be read, but under stress it is possible for fewer records to be read. The fix is to make the assertion check a fuzzy/range check rather than exact equality, since there's not a straightforward way to precisely test a timer-based feature.
ghstack-source-id: 91543898
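
An illustration of the assertion change (a standalone toy test, not the actual dataio test):

```python
import unittest

class FuzzyReaderCheck(unittest.TestCase):
    def test_range_instead_of_equality(self):
        upper_bound = 100
        num_read = 37  # whatever the time-limited reader managed to consume
        # Exact equality against upper_bound is flaky under stress;
        # a range check is not.
        self.assertGreaterEqual(num_read, 0)
        self.assertLessEqual(num_read, upper_bound)

if __name__ == "__main__":
    unittest.main()
```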

Test Plan:
`buck test mode/dev-tsan //caffe2/caffe2/python:dataio_test-2.7 -- --stress-runs 20` -> P117156924 (with fix, 100% pass)

P117158750 - without fix, lots of failures in this test

Reviewed By: boryiingsu

Differential Revision: D17816775

fbshipit-source-id: 2ab0d3304fbd9c9806d37a4fe2912c840616db61
2019-10-10 13:53:58 -07:00
Jason Fried
b96f49885f caffe2 python ideep conv_op test_int8_convolution skip for python 3
Summary: This test was failing in 3.7; it turns out it was omitted by the test director in 3.6, so I added a skip for both versions.

Test Plan: The unit test is skipped in 3.7 and 3.6; all other tests pass.

Reviewed By: tomdz

Differential Revision: D17820967

fbshipit-source-id: 571f0ec7fe1b0cb50ead4e0d18c00151a701f36a
2019-10-08 21:31:11 -07:00
Lin Jiang
1f158adeee Add support for attention weight in SparseLookup (#26748)
Summary:
Support attention weights input to SparseLookup. In attention sum pooling, if attention weights can be pre-calculated before the embedding lookup, they can be passed to SparseLookup and processed by the SparseLengthsWeightedSum op. One example is id_score attention sum pooling.

Essentially the net is converted from:
  LengthsSum(Mul(Gather(keys, w), att_weight))
to:
  SparseLengthsWeightedSum(keys, w, att_weight)

It unblocks potential efficiency gains with distributed training.
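
A minimal sketch of the fused form through the Caffe2 Python API (toy shapes; input order is DATA, WEIGHTS, INDICES, LENGTHS):

```python
from caffe2.python import core, workspace
import numpy as np

workspace.FeedBlob("w", np.random.randn(10, 4).astype(np.float32))            # embedding table
workspace.FeedBlob("att_weight", np.array([0.9, 0.1, 1.0], dtype=np.float32))
workspace.FeedBlob("indices", np.array([1, 3, 7], dtype=np.int64))
workspace.FeedBlob("lengths", np.array([2, 1], dtype=np.int32))               # two segments

# Replaces LengthsSum(Mul(Gather(w, indices), att_weight)) with one op.
op = core.CreateOperator(
    "SparseLengthsWeightedSum",
    ["w", "att_weight", "indices", "lengths"],
    ["pooled"],
)
workspace.RunOperatorOnce(op)
print(workspace.FetchBlob("pooled").shape)  # (2, 4)
```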

Pull Request resolved: https://github.com/pytorch/pytorch/pull/26748

Test Plan: unit test

Reviewed By: chocjy

Differential Revision: D17553345

Pulled By: wheatkit

fbshipit-source-id: 60cc3c4b0bc1eade5459ac598e85286f3849a412
2019-10-08 20:22:25 -07:00
Swati Rallapalli
e63addfff6 Exponential decay of the weight of task loss (#27508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27508

Implemented a simple exponential decay of the weight of the LR loss function, with a lower bound.
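
A minimal sketch of such a schedule (parameter names are illustrative; the dper implementation may differ):

```python
def decayed_task_weight(initial_weight, decay_rate, step, lower_bound):
    # Exponential decay of the task-loss weight, clamped at a lower bound.
    return max(lower_bound, initial_weight * decay_rate ** step)

for step in (0, 10, 20, 30, 40):
    print(step, decayed_task_weight(1.0, 0.9, step, 0.1))
```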

Test Plan:
buck test //caffe2/caffe2/fb/dper/layer_models/tests:mtml_test -- test_task_weight_decay
https://our.intern.facebook.com/intern/testinfra/testrun/3377699729136308

canary: f140103452

Reviewed By: chenshouyuan

Differential Revision: D17524101

fbshipit-source-id: 9a653e21a4ecb74dfc4ac949c9e3388f36ef3a20
2019-10-08 09:15:41 -07:00
Kevin Chen
c2223df578 Implement LpNorm regularizer to be used on the inputs for feature importance (#26376)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26376

* Create the new dense_feature_reg (FCInputLpNorm) regularizer for feature importance, applied to the fully-connected input layer.

Test Plan: * Unit test located in: `caffe2/caffe2/fb/dper/layer_models/tests/split_1/sparse_nn_test.py`

Reviewed By: un-disclosed

Differential Revision: D17360361

fbshipit-source-id: 1a0e119eeb17199a13dfffe58b3036ea4255e301
2019-10-03 09:39:42 -07:00
Xing Wang
a1513dced3 Integrate FC fp16 exporter into Dper2 (#26582)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26582

Add the blob quantization and replace the op in the eval/predictor net.

Test Plan:
# Unit test:

-----

buck build fblearner/flow/projects/dper/tests/validators:test_exporter_options_validators

./buck-out/gen/fblearner/flow/projects/dper/tests/validators/test_exporter_options_validators#binary.par

----

buck build caffe2/caffe2/fb/dper/layer_models/tests:exporter_test

./buck-out/gen/caffe2/caffe2/fb/dper/layer_models/tests/exporter_test-2.7#binary.par

Reviewed By: chocjy

Differential Revision: D17439720

fbshipit-source-id: 68de5d0322b0111aeca5ed552210bf80a4cddc78
2019-09-29 10:19:28 -07:00