Time inefficiencies

ARAP

1) In the constructor of the arap class, calculating m_solver.compute(m_transpose * m_transpose.transpose()) can be time-consuming for large matrices. This computation is performed every time an instance of the arap class is created, yet it does not depend on any constructor parameters. If the m_transpose matrix is the same for all instances, the factorization can be moved out of the constructor and performed only once.
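
A minimal sketch of the "compute once, share everywhere" pattern, assuming the factorization really is identical for all instances. The `Factorization`, `expensive_factorize`, and `shared_factorization` names are hypothetical stand-ins for the Eigen solver call, since the real types are not shown here:

```cpp
#include <cmath>

// Hypothetical stand-in for the expensive factorization; in the real code
// this corresponds to m_solver.compute(m_transpose * m_transpose.transpose()).
struct Factorization {
    double value;  // pretend result of the factorization
};

Factorization expensive_factorize() {
    double v = 0.0;
    for (int i = 1; i <= 1000; ++i) v += std::sqrt(static_cast<double>(i));
    return Factorization{v};
}

// Meyers singleton: the factorization runs exactly once, on first use,
// no matter how many arap instances are constructed afterwards.
// Function-local static initialization is thread-safe since C++11.
const Factorization& shared_factorization() {
    static const Factorization f = expensive_factorize();
    return f;
}

// Each arap instance now just refers to the shared result.
struct arap {
    const Factorization& solver = shared_factorization();
};
```

Every constructed `arap` refers to the same factorization object, so the cost is paid once per program run instead of once per instance.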

2) In the arap::difference_matrix method, the for (auto edge : *edges) loop iterates over all edges and executes ret.row(idx++) = this_pos - Eigen::Vector3d(get_position(edge.end));, which can become expensive when the number of edges is large. One improvement is to pre-allocate the ret matrix to its final size before the loop and write each row directly by index, avoiding incremental growth and the per-edge construction of a temporary Eigen::Vector3d.
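
A sketch of the pre-allocation idea using a plain `std::vector` and a stand-in `Vec3` type in place of Eigen (the function name and signature are hypothetical, for illustration only):

```cpp
#include <vector>
#include <cstddef>

struct Vec3 { double x, y, z; };  // stand-in for Eigen::Vector3d

// Hypothetical sketch of difference_matrix: the output is sized once up
// front and rows are written by index, instead of growing incrementally.
std::vector<Vec3> difference_rows(const Vec3& this_pos,
                                  const std::vector<Vec3>& edge_end_positions) {
    std::vector<Vec3> ret(edge_end_positions.size());  // one allocation, final size
    for (std::size_t i = 0; i < edge_end_positions.size(); ++i) {
        const Vec3& p = edge_end_positions[i];
        ret[i] = Vec3{this_pos.x - p.x, this_pos.y - p.y, this_pos.z - p.z};
    }
    return ret;
}
```

With Eigen the equivalent step would be resizing `ret` to (edge count × 3) before the loop; the point is the single up-front allocation.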

3) In the arap::dynamic_right_side and arap::fixed_variables methods, the for (auto edge : *edges) loop iterates over all edges for each vertex. Instead of reloading the edges for every vertex, it would be more efficient to load them once before the loop and reuse them inside it.
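
The hoisting pattern in miniature, with a hypothetical `load_edges` standing in for the repeated lookup (the call counter exists only to make the "loaded once" property visible):

```cpp
#include <vector>
#include <cstddef>

// Hypothetical stand-in: load_edges() is the loop-invariant lookup that
// the original code repeats for every vertex.
static int g_load_calls = 0;
std::vector<int> load_edges() { ++g_load_calls; return {1, 2, 3}; }

// The edges are loaded once, outside the per-vertex loop, and reused.
int process_vertices(std::size_t vertex_count) {
    const std::vector<int> edges = load_edges();  // hoisted out of the loop
    int total = 0;
    for (std::size_t v = 0; v < vertex_count; ++v)
        for (int e : edges) total += e;
    return total;
}
```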

PBD

1) In the pbd::init method, the scenario and attachment arrays map the scenario and attachment-strategy indices to the corresponding member-function indices. However, these arrays are created and initialized on every call to init. Since their contents never change, they can be declared static (or constexpr at namespace scope) so they are built only once.
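
A sketch of the fixed lookup table as a `constexpr std::array` (the table contents and the `handler_for_scenario` helper are hypothetical; only the pattern matters):

```cpp
#include <array>
#include <cstddef>

// Hypothetical mapping from a scenario index to an internal handler index.
// Declaring the table constexpr means it is built once, at compile time,
// rather than re-created on every call to init().
constexpr std::array<int, 4> kScenarioToHandler{2, 0, 3, 1};

int handler_for_scenario(std::size_t scenario) {
    return kScenarioToHandler.at(scenario);  // bounds-checked lookup
}
```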

2) The pbd::get_graph method simply returns the member variable m_graph. If get_graph is called frequently, accessing the member directly would avoid the call overhead; in practice, defining the accessor inline (e.g. in the header) achieves the same effect, since modern compilers inline such trivial getters.

3) In the pbd::execute method, the loop for (size_t a = m_start; a < m_end; a++) iterates over the muscle nodes. Inside the loop, the force applied to each node is obtained via double* f = m_graph->node(a).get_force();. Retrieving a force pointer for every node can be inefficient if the lookup involves non-trivial computation or unfavourable memory-access patterns; if the underlying storage is contiguous, a single base pointer obtained before the loop could be advanced instead.

4) In the pbd::execute method, the grav variable is computed as double grav = std::pow(10, m_gravity) - 1;. This involves a power function and a subtraction. If m_gravity remains constant for the duration of execute, precomputing this value once outside the loop would be more efficient.
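
A sketch of hoisting the loop-invariant `grav` computation; the `apply_gravity` function and its per-node use of `grav` are hypothetical, since only the grav line is quoted from the original code:

```cpp
#include <cmath>
#include <vector>
#include <cstddef>

// Hypothetical sketch: grav depends only on m_gravity, so it is computed
// once before the loop instead of once per node.
std::vector<double> apply_gravity(const std::vector<double>& forces_y,
                                  double m_gravity) {
    const double grav = std::pow(10.0, m_gravity) - 1.0;  // loop-invariant
    std::vector<double> out(forces_y.size());
    for (std::size_t i = 0; i < forces_y.size(); ++i)
        out[i] = forces_y[i] - grav;  // illustrative per-node use
    return out;
}
```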

5) In the laplacianSmoothing function, the code allocates a new smoothed_pos array for each vertex in the muscle model. This allocation happens inside an inner loop, which is expensive in terms of repeated allocation and deallocation. It would be more efficient to allocate the array once before the loop and reuse it.
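
A minimal sketch of the single-allocation version, reduced to a 1D closed polyline for brevity (the `smooth_closed` function and the uniform three-point average are illustrative assumptions, not the project's actual smoothing kernel):

```cpp
#include <vector>
#include <cstddef>

// Laplacian smoothing on a closed polyline: each vertex moves to the
// average of itself and its two neighbours. The output buffer is
// allocated once up front, not re-allocated per vertex.
std::vector<double> smooth_closed(const std::vector<double>& pos) {
    const std::size_t n = pos.size();
    std::vector<double> smoothed(n);  // single allocation outside the loop
    for (std::size_t i = 0; i < n; ++i) {
        const double prev = pos[(i + n - 1) % n];
        const double next = pos[(i + 1) % n];
        smoothed[i] = (prev + pos[i] + next) / 3.0;
    }
    return smoothed;
}
```

Writing into a separate, reused buffer also means each vertex is smoothed from the original positions rather than from already-updated neighbours.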

6) The applyLaplacianSmoothing function is called for each vertex inside the outer loop. Instead of updating each vertex in place one at a time, it would be more efficient to collect the smoothed positions for all vertices first and then apply them together after the loop; this also makes the result independent of the order in which vertices are processed.

There are several loops in the PBD class that are candidates for parallelization: the iterations over vertices, collisions, muscle models, and triangles. Executing these iterations concurrently leverages multi-core processors and can significantly reduce overall execution time. When parallelizing, however, proper synchronization and data-sharing mechanisms are needed to avoid race conditions and maintain data consistency.
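
A sketch of the simplest safe parallelization, assuming the per-element work is independent (as in a per-vertex update): the index range is split into disjoint chunks, one per thread, so no locking is required. The `parallel_scale` function and its workload are hypothetical placeholders for one of the PBD loops:

```cpp
#include <thread>
#include <vector>
#include <algorithm>
#include <cstddef>

// Hypothetical sketch: a per-element loop split across threads.
// Each thread writes a disjoint slice of `out`, so there is no data race.
void parallel_scale(const std::vector<double>& in, std::vector<double>& out,
                    double factor, unsigned num_threads) {
    const std::size_t n = in.size();
    const std::size_t chunk = (n + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = t * chunk;
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back([&in, &out, factor, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                out[i] = in[i] * factor;  // disjoint writes: no race
        });
    }
    for (auto& w : workers) w.join();
}
```

For loops that accumulate into shared state (e.g. a total volume), each thread should keep a private partial result that is combined after the join, rather than sharing a single accumulator.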

Graph

1) In the constructor of the graph class, memory for the various fields (m_arr, m_collision, m_volume, m_polys, m_poly_neighbours, m_normals) is allocated with the new operator. Raw new[] allocations are error-prone and require manual cleanup; replacing them with std::vector (or a memory pool when many small blocks are involved) gives contiguous storage, automatic deallocation, and usually equal or better performance.
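
A sketch of the storage rewritten with `std::vector`; the `graph_storage` struct and its two fields are simplified, hypothetical stand-ins for the graph class's real members:

```cpp
#include <vector>
#include <cstddef>

struct Vec3 { double x, y, z; };

// Hypothetical sketch of the graph storage using std::vector instead of
// raw new[]: one contiguous allocation per field, automatic cleanup, and
// no hand-written destructor needed.
struct graph_storage {
    std::vector<Vec3> positions;  // was e.g.: m_arr = new double[...]
    std::vector<int>  polys;      // was e.g.: m_polys = new int[...]

    graph_storage(std::size_t vertex_count, std::size_t poly_count)
        : positions(vertex_count), polys(poly_count * 3) {}
};
```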

2) Volume calculation: the calc_volume method computes the volume of the model by iterating over all triangles and performing vector calculations. The implementation could be simplified and made more robust by using the divergence theorem, i.e. summing the signed volumes of the tetrahedra spanned by the origin and each triangle (the 3D analogue of the 2D shoelace formula).
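
A self-contained sketch of the divergence-theorem approach, assuming a closed, consistently oriented triangle mesh (the `Vec3` type and index layout are illustrative, not the project's actual mesh representation):

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

struct Vec3 { double x, y, z; };

// Signed volume of the tetrahedron (origin, a, b, c): dot(a, cross(b, c)) / 6.
double signed_tet_volume(const Vec3& a, const Vec3& b, const Vec3& c) {
    return (a.x * (b.y * c.z - b.z * c.y)
          + a.y * (b.z * c.x - b.x * c.z)
          + a.z * (b.x * c.y - b.y * c.x)) / 6.0;
}

// Volume of a closed triangle mesh via the divergence theorem: sum the
// signed tetrahedra formed by the origin and each triangle. Contributions
// from regions outside the mesh cancel, leaving the enclosed volume.
double mesh_volume(const std::vector<Vec3>& v,
                   const std::vector<std::size_t>& tris) {  // 3 indices per triangle
    double vol = 0.0;
    for (std::size_t i = 0; i + 2 < tris.size(); i += 3)
        vol += signed_tet_volume(v[tris[i]], v[tris[i + 1]], v[tris[i + 2]]);
    return std::fabs(vol);
}
```

For the unit tetrahedron with vertices (0,0,0), (1,0,0), (0,1,0), (0,0,1), this yields the expected volume 1/6.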

3) The generate_collision_detection function checks whether m_collision[part] is null and allocates memory with new when necessary. Using a smart pointer (e.g. std::unique_ptr) to manage the allocation and deallocation automatically would avoid memory leaks and improve code safety, at no runtime cost.
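
A sketch of the lazy allocation with `std::unique_ptr`; `CollisionData` and `collision_holder` are hypothetical stand-ins for the real per-part collision structure:

```cpp
#include <memory>

// Hypothetical stand-in for the per-part collision structure.
struct CollisionData { int part; };

// With std::unique_ptr, allocation still happens lazily, but deallocation
// is automatic: no delete call, and no leak if an exception is thrown.
struct collision_holder {
    std::unique_ptr<CollisionData> m_collision;

    CollisionData& get_or_create(int part) {
        if (!m_collision)
            m_collision = std::make_unique<CollisionData>(CollisionData{part});
        return *m_collision;
    }
};
```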

4) Using standard algorithms instead of hand-written loops could also improve performance and readability: the update_normal function operates on individual array elements with plain for loops. These calculations can be expressed with functions from the <algorithm> or <numeric> headers, such as std::transform.
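
A small sketch of the `std::transform` style; the `normalize_lengths` function and its workload are hypothetical, chosen only to show the loop-to-algorithm rewrite:

```cpp
#include <algorithm>
#include <vector>
#include <cmath>

// Hypothetical sketch: computing per-element square roots with
// std::transform instead of an index-based for loop. The algorithm states
// the intent clearly and adapts naturally to parallel execution policies.
std::vector<double> normalize_lengths(const std::vector<double>& squared) {
    std::vector<double> out(squared.size());
    std::transform(squared.begin(), squared.end(), out.begin(),
                   [](double s) { return std::sqrt(s); });
    return out;
}
```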

5) The code stores data in plain arrays and vectors. Where lookups by key are needed, data structures such as std::unordered_map, std::unordered_set, or std::set might be more time-efficient, replacing linear scans with average constant-time (or logarithmic) lookups.
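
A sketch of a keyed lookup with `std::unordered_map`; the `node_index` struct and the id-to-index mapping are hypothetical examples of a lookup the code might currently do with a linear scan:

```cpp
#include <unordered_map>
#include <cstddef>

// Hypothetical sketch: mapping node ids to array indices with
// std::unordered_map gives average O(1) lookup instead of a linear scan.
struct node_index {
    std::unordered_map<int, std::size_t> by_id;

    void add(int id, std::size_t idx) { by_id[id] = idx; }

    // Returns the stored index, or size_t(-1) when the id is unknown.
    std::size_t find(int id) const {
        auto it = by_id.find(id);
        return it == by_id.end() ? static_cast<std::size_t>(-1) : it->second;
    }
};
```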

6) The log_of_int function can be simplified and optimized. Instead of looping and shifting to the right, std::log2 can be used to compute the binary logarithm (or, in C++20, std::bit_width for an exact integer result).
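
A sketch of the replacement, assuming log_of_int computes floor(log2(x)) for a positive 32-bit integer (the original implementation is not shown, so the exact contract is an assumption):

```cpp
#include <cmath>
#include <cstdint>

// floor(log2(x)) for a positive 32-bit integer, replacing a shift loop.
// double has enough precision for all 32-bit inputs; for wider integers,
// C++20's std::bit_width(x) - 1 gives an exact integer-only answer.
int log_of_int(std::uint32_t x) {
    return static_cast<int>(std::log2(static_cast<double>(x)));
}
```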

7) In the update_stiffness function, the get_edges function is called repeatedly in a loop for each node. It would be more efficient to call it once before looping and store the result in a local variable.

Updated by Jan Krajňák almost 2 years ago · 8 revisions