tensorflow

mirror of https://github.com/zebrajr/tensorflow.git synced 2026-01-15 12:15:41 +00:00

Author	SHA1	Message	Date
Niklas Vangerow	dd10786acd	Migrate conditional_test to PjRt. PiperOrigin-RevId: 847911726	2025-12-22 16:03:25 -08:00
Subhankar Shah	69ea2a9308	Allow prefetching an hlo value if its use is colored in alternate memory even if the loop optimizer has decided otherwise. PiperOrigin-RevId: 847872344	2025-12-22 13:56:48 -08:00
Haibo Huang	21d80205a6	Add PJRT_Buffer_DonateWithControlDependency to the PJRT C API. PiperOrigin-RevId: 847868215	2025-12-22 13:40:01 -08:00
Niklas Vangerow	3480eee02b	Add HloModuleFromXlaComputation to HloRunnerAgnosticTestBase. Sometimes it is useful to turn an XlaComputation straight into a HloModule in a test. This is already functionality we basically support, but until now the computation had to be in the form of an XlaBuilder, which is not always practical. PiperOrigin-RevId: 847856677	2025-12-22 13:03:06 -08:00
Maxim Ermilov	b7650e843b	Add proto serialization for CollectivePermuteStartThunk PiperOrigin-RevId: 847846872	2025-12-22 12:27:26 -08:00
Dirk Hornung	678058948b	[Autotuner] Limit CuDNN tests to CuDNN autotuner backend. PiperOrigin-RevId: 847842272	2025-12-22 12:12:11 -08:00
Maxim Ermilov	1fa15367ad	Add proto serialization for RaggedAllToAllStartThunk PiperOrigin-RevId: 847830182	2025-12-22 11:37:45 -08:00
Byungchul Kim	fec780d7fe	Set FC's keep_num_dims to false when output dims is different from input dims after quantization. On gemma3n with decode batch > 1, it happens when the embedding is coupled with PLE by einsum. The export steps are: 1) Initial: BMM([b,2048]x[2048,7680] -> [b,7680]) 2) FuseInputReshape_BatchMatMulWithFlattenedRhsDims: BMM([b,2048]x[2048,7680] -> [b,7680]) 3) ConvertBatchMatMulOp2FullyConnectedOp_Rank2ConstantRhs: FC([b,2048]x[2048,7680] -> [b,7680]) 4) StrictQuantizationPattern(by IsDrqTensor): FC([b,1,2048]x[2048,7680] -> [b,7680]) When FC's keep_num_dims is false and it's followed by reshape op (like gemma3n), keep_num_dims will be set to true later with correct shapes by EnableFullyConnectedKeepNumDimsBeforeReshape. PiperOrigin-RevId: 847813526	2025-12-22 10:45:22 -08:00
Dirk Hornung	9ca49fcfa5	Limit CublasDot deterministic test to Cublas autotuning backend. PiperOrigin-RevId: 847803638	2025-12-22 10:18:21 -08:00
A. Unique TensorFlower	573bbe2b41	Migrates `builder.create<Op>()` => `Op::create()` in tablegen files PiperOrigin-RevId: 847796796	2025-12-22 09:54:11 -08:00
Oleg Shyshkov	3cec0d7b92	[XLA:GPU] Clean up RaggedAllToAllStartThunk rendezvous helpers. PiperOrigin-RevId: 847783200	2025-12-22 09:07:49 -08:00
A. Unique TensorFlower	af38f913d0	Automated Code Change PiperOrigin-RevId: 847756872	2025-12-22 07:39:52 -08:00
Dirk Hornung	3ea706cab3	Add --xla_gpu_experimental_autotune_backends to allow for selecting backends. This change is for the new autotuner. The new autotuner with its Triton backend competes with cuDNN fusions, leading to flaky tests. Also, some tests disable some autotuning paths via --xla_gpu_cudnn_gemm_fusion_level or --xla_gpu_cublas_fallback, which are not fully compatible with the new autotuner. Other tests rely on the order of the backends, which would be resolved by adding a backend selection mechanism. PiperOrigin-RevId: 847750954	2025-12-22 07:17:41 -08:00
Kanish Anand	4d0edd395f	Refactor `std::optional` comparison in `ReshapeSharding` tests PiperOrigin-RevId: 847749800	2025-12-22 07:07:40 -08:00
Henning Becker	12502acbf5	Remove unnecessary if_gpu_is_configured from Triton tests. The tests in xla/backends/gpu/codegen/triton/BUILD are already configured to run only on specific GPU backends, making the if_gpu_is_configured check on the srcs redundant. PiperOrigin-RevId: 847738574	2025-12-22 06:26:40 -08:00
Oleg Shyshkov	2f90852c17	[XLA:GPU] Remove TF_ prefix from RETURN_IF_ERROR and ASSIGN_OR_RETURN macros. PiperOrigin-RevId: 847716343	2025-12-22 04:59:16 -08:00
deeptanshusekhri	d0b7f40548	[tosa] : fixing dynamic batch handling in FullyConnected legalization (#106638)	2025-12-22 04:10:39 -08:00
Dirk Hornung	f5b102299e	[Autotuner] Log autotuner config in readable json format. When debugging the autotuner we often want to know the values of the AutotuneConfig. PiperOrigin-RevId: 847683182	2025-12-22 03:01:32 -08:00
Henning Becker	23dd865ee5	Remove redundant TENSORFLOW_USE_ROCM define. The `TENSORFLOW_USE_ROCM=1` local define is no longer required for the `rocm_solver_context` target. PiperOrigin-RevId: 847677878	2025-12-22 02:50:30 -08:00
A. Unique TensorFlower	dfc5b243ca	Automated Code Change PiperOrigin-RevId: 847667783	2025-12-22 02:44:02 -08:00
Dirk Hornung	79af5068fd	[Autotuner] Avoid compiling all configurations if we only return the first one. This happens when we want to select the first configuration that successfully compiles, e.g. for determinism. PiperOrigin-RevId: 847656341	2025-12-22 02:37:40 -08:00
A. Unique TensorFlower	d48869043b	compat: Update forward compatibility horizon to 2025-12-22 PiperOrigin-RevId: 847654748	2025-12-22 02:31:13 -08:00
A. Unique TensorFlower	14b51dd700	Update GraphDef version to 2449. PiperOrigin-RevId: 847654695	2025-12-22 02:14:02 -08:00
Dirk Hornung	85172d7831	[XLA:GPU] Shard the gpu_compiler_test. The _h100 test regularly causes timeouts. PiperOrigin-RevId: 847654247	2025-12-22 02:01:39 -08:00
A. Unique TensorFlower	37da2f6658	Automated Code Change PiperOrigin-RevId: 847648279	2025-12-22 01:41:30 -08:00
A. Unique TensorFlower	ec8a966f0d	Automated Code Change PiperOrigin-RevId: 847644299	2025-12-22 01:33:33 -08:00
A. Unique TensorFlower	2e5b1e44fc	Automated Code Change PiperOrigin-RevId: 847644164	2025-12-22 01:18:15 -08:00
A. Unique TensorFlower	53c2f78993	Automated Code Change PiperOrigin-RevId: 847643376	2025-12-22 01:10:46 -08:00
A. Unique TensorFlower	3c7c52e730	Automated Code Change PiperOrigin-RevId: 847641457	2025-12-22 01:02:55 -08:00
Dirk Hornung	6e5d62bf3e	Increase shards for fusion_emitter_device_test to speed up the test. PiperOrigin-RevId: 847632914	2025-12-22 00:51:12 -08:00
A. Unique TensorFlower	bb8c750b2f	Automated Code Change PiperOrigin-RevId: 847628658	2025-12-22 00:40:30 -08:00
Dirk Hornung	b1d2538541	[Autotuner] Initialize random input values for buffer checks. If values are initialized to 0, the buffer checker will fail to detect backends with wrong results. PiperOrigin-RevId: 847627821	2025-12-22 00:31:57 -08:00
A. Unique TensorFlower	7b0d71c54e	Automated Code Change PiperOrigin-RevId: 847625245	2025-12-22 00:13:36 -08:00
A. Unique TensorFlower	64337a1e3c	Automated Code Change PiperOrigin-RevId: 847624939	2025-12-22 00:03:30 -08:00
A. Unique TensorFlower	6165d577f9	Automated Code Change PiperOrigin-RevId: 847622964	2025-12-21 23:49:29 -08:00
A. Unique TensorFlower	2b621d61f9	Automated Code Change PiperOrigin-RevId: 847622876	2025-12-21 23:38:53 -08:00
A. Unique TensorFlower	f4e53263b1	Automated Code Change PiperOrigin-RevId: 847622680	2025-12-21 23:23:39 -08:00
A. Unique TensorFlower	e3e3bc1946	Reverts `c549ee47f8` PiperOrigin-RevId: 847535506	2025-12-21 18:14:27 -08:00
Junwhan Ahn	7733c4c03d	Use `StartDetachedThread` instead of `SchedClosure` to dispatch atom program compilation PiperOrigin-RevId: 847528854	2025-12-21 17:33:37 -08:00
A. Unique TensorFlower	ce61030c67	Automated Code Change PiperOrigin-RevId: 847480750	2025-12-21 13:26:51 -08:00
A. Unique TensorFlower	0cfba6c852	Automated Code Change PiperOrigin-RevId: 847414785	2025-12-21 08:39:12 -08:00
A. Unique TensorFlower	f356a762f3	Automated Code Change PiperOrigin-RevId: 847414761	2025-12-21 08:15:05 -08:00
A. Unique TensorFlower	630698a3af	Automated Code Change PiperOrigin-RevId: 847414309	2025-12-21 07:40:13 -08:00
A. Unique TensorFlower	f253afed70	Automated Code Change PiperOrigin-RevId: 847412849	2025-12-21 07:23:29 -08:00
Kanish Anand	5042531aa8	Moving definitions to cpp file, match function definition declaration order PiperOrigin-RevId: 847385799	2025-12-21 05:00:39 -08:00
A. Unique TensorFlower	a27a856d1c	Automated Code Change PiperOrigin-RevId: 847361371	2025-12-21 02:58:27 -08:00
A. Unique TensorFlower	ffb02301df	Update GraphDef version to 2448. PiperOrigin-RevId: 847339150	2025-12-21 01:32:04 -08:00
A. Unique TensorFlower	e60b3eb362	compat: Update forward compatibility horizon to 2025-12-21 PiperOrigin-RevId: 847339112	2025-12-21 01:18:45 -08:00
Bhupendra Dubey	ff7eb222c2	Refactor XLA Profiler State Check to Use Low-Overhead C API This CL refactors the XLA profiler's state-checking mechanism to resolve GIL deadlocks and improve performance. Previously, the C++ profiler context would import a Python module to update the profiler's state. This operation, performed while holding the GIL, could cause deadlocks if the import failed (e.g., in a JAX-only environment). This change replaces the fragile cross-language import with a shared C++ std::atomic<bool>. Python code now queries this state via a new, low-overhead C function (is_traceme_enabled_raw) instead of ctypes. This approach eliminates the deadlocks, decouples the C++ profiler from Python modules, and maintains high performance for the state check. The internal C++ API was also updated to use a safer reference instead of a raw pointer. PiperOrigin-RevId: 847261952	2025-12-20 19:56:52 -08:00
A. Unique TensorFlower	580eeae4c3	Automated Code Change PiperOrigin-RevId: 847190483	2025-12-20 16:29:24 -08:00

188611 Commits