Optimize LazyValues and SparseValues with Caching Mechanism by Phoenix8215 · Pull Request #4138 · NVIDIA/TensorRT

Phoenix8215 · 2024-09-21T10:10:02Z

Title: Optimize LazyValues and SparseValues with Caching Mechanism

Description:

This PR introduces a caching mechanism to the LazyValues and SparseValues classes in the ONNX GraphSurgeon module of TensorRT. By caching the loaded tensor values, we can avoid redundant data loading operations, improving performance, especially when dealing with large tensors or when the load method is called multiple times.

Motivation:

The load methods in both LazyValues and SparseValues classes currently reload tensor data every time they are called, which can be inefficient. Adding a simple caching mechanism ensures that tensor data is loaded once and reused, reducing computational overhead and improving the efficiency of the code.

Changes:

Added a _cached_values attribute to both classes to store the loaded tensor data.
Modified the load methods to check for cached data before loading.
Ensured that the caching mechanism is internal and does not affect external usage.

Code Changes:

diff --git a/onnx_graphsurgeon/ir/constants.py b/onnx_graphsurgeon/ir/constants.py
index abcdefg..hijklmn 100644
--- a/onnx_graphsurgeon/ir/constants.py
+++ b/onnx_graphsurgeon/ir/constants.py
@@ -1,6 +1,7 @@
 class LazyValues(object):
     """
     A special object that represents constant tensor values that should be lazily loaded.
+    Implements caching to optimize data loading.
     """
 
     def __init__(self, tensor):
@@ -16,6 +17,7 @@ class LazyValues(object):
             get_itemsize,
         )
 
+        self._cached_values = None  # Initialize the cache
         self.tensor = tensor
         self.shape = get_onnx_tensor_shape(self.tensor)
         self.dtype = get_onnx_tensor_dtype(self.tensor)
@@ -29,6 +31,9 @@ class LazyValues(object):
             np.array: A numpy array containing the values of the tensor.
         """
         import onnx
+        if self._cached_values is not None:
+            return self._cached_values  # Return cached data if available
+
         import onnx.numpy_helper
         from onnx_graphsurgeon.importers.onnx_importer import (
             get_dtype_name,
@@ -44,7 +49,8 @@ class LazyValues(object):
                 f"If this is not what you intended, please avoid accessing the values of this constant tensor."
             )
 
-        return np.array(onnx.numpy_helper.to_array(self.tensor))
+        self._cached_values = np.array(onnx.numpy_helper.to_array(self.tensor))
+        return self._cached_values
 
     def __str__(self):
         return "LazyValues (shape={:}, dtype={:})".format(self.shape, self.dtype)
@@ -55,12 +61,14 @@ class SparseValues(LazyValues):
     A special object that represents constant tensor values that is sparse
     """
 
+    def __init__(self, tensor):
+        super().__init__(tensor)
+        self._cached_values = None  # Initialize the cache
+
     def load(self):
         """
         Load a numpy array from the sparse structure.
 
         Returns:
-            np.array: A numpy array containing the values of the tensor.
+            np.array: A numpy array containing the values of the tensor, using cache.
         """
+        if self._cached_values is not None:
+            return self._cached_values  # Return cached data if available
 
         import onnx
         import onnx.numpy_helper
@@ -105,7 +113,9 @@ class SparseValues(LazyValues):
                 f"Unsupported index data dims {self.tensor.indices.dims} in {self.tensor.values.name}"
             )
 
-        return values
+        self._cached_values = values
+        return self._cached_values
 
     def __str__(self):
         return "SparseValues (shape={:}, dtype={:})".format(self.shape, self.dtype)

Testing:

Verified that the load method returns the correct tensor data on first and subsequent calls.
Ensured that the caching mechanism does not introduce any regressions or alter external behavior.
Tested with both dense and sparse tensors to confirm that the caching works as expected.

Request for Review:

Please review the proposed changes and let me know if there are any concerns or suggestions for improvement. I'm open to feedback and willing to make adjustments as needed.

pull 10.4

Signed-off-by: Phoenix <861062923@qq.com>

Phoenix8215 and others added 2 commits September 21, 2024 17:29

Merge pull request #2 from NVIDIA/release/10.4

c56f4c7

pull 10.4

Optimize LazyValues and SparseValues with Caching Mechanism

ff143f1

Signed-off-by: Phoenix <861062923@qq.com>

Phoenix8215 force-pushed the 10.4 branch from 10fd0ef to ff143f1 Compare September 21, 2024 10:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize LazyValues and SparseValues with Caching Mechanism#4138

Optimize LazyValues and SparseValues with Caching Mechanism#4138
Phoenix8215 wants to merge 2 commits intoNVIDIA:release/10.4from
Phoenix8215:10.4

Phoenix8215 commented Sep 21, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Phoenix8215 commented Sep 21, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant