This is an automatically translated post by LLM. The original post is in Chinese.
If you find any translation errors, please leave a comment to help me
improve the translation. Thanks!
Introduction
Sometimes, life brings us surprises. It could be a beautiful sunset
we stumble upon while skipping class, or a blooming flower we wake up
to. In this ever-progressing industrial era, we are all involuntarily
caught up in the hustle and bustle. This is the deepest realization I've
had since I entered graduate school and gained a deeper understanding of
the world.
Python High Performance Programming
Performance Analysis
Time Analysis
Use decorators to automatically measure time:
```python
from functools import wraps
import time

def timefn(fn):
    @wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = time.time()
        result = fn(*args, **kwargs)
        t2 = time.time()
        print("@timefn: " + fn.__name__ + " took " + str(t2 - t1) + " seconds")
        return result
    return measure_time
```
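As a quick sanity check, the decorated function still returns its result, and functools.wraps preserves the wrapped function's metadata. The workload below (a hypothetical slow_sum) is only an illustration:

```python
import time
from functools import wraps

def timefn(fn):
    # Same decorator as above: wraps fn and prints its wall-clock time.
    @wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = time.time()
        result = fn(*args, **kwargs)
        t2 = time.time()
        print("@timefn: " + fn.__name__ + " took " + str(t2 - t1) + " seconds")
        return result
    return measure_time

@timefn
def slow_sum(n):
    # Hypothetical workload: sum the first n integers.
    return sum(range(n))

total = slow_sum(1_000_000)
print(total)              # the decorated call still returns the result
print(slow_sum.__name__)  # @wraps keeps the original name: "slow_sum"
```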
In IPython, you can use the %timeit magic function to
time function runs:
```
%timeit <function>
```
Use the UNIX time command to time:
```
/usr/bin/time -p (--verbose) python <script-name>
```
Use the cProfile module:
cProfile is a built-in profiling tool in the standard library. It hooks into CPython's virtual machine to measure the time taken by every function call. Run it with:
```
python -m cProfile -s cumulative <script-name>
```
Use line_profiler for line-by-line timing:
Install: pip install line_profiler
Add the @profile decorator before the function to be analyzed, and run the following command for performance analysis:
```
kernprof -l -v <script-name>
```
Space Analysis
Use memory_profiler to analyze memory usage:
Install: pip install memory_profiler
Add the @profile decorator before the function to be
analyzed, and run the following command for performance analysis:
```
python -m memory_profiler <script-name>
```
Use mprof to plot memory usage. This can be done in two
steps:
Run the program using mprof, which will generate an
mprof*.dat file in the directory:
```
mprof run <script-name>
```
Plot the graph:
```
mprof plot
```
Lists and Tuples
Lists and tuples belong to a class of data structures called arrays: flat collections of data stored with an intrinsic ordering.
The main differences between lists and tuples are as follows:
Lists are dynamic arrays that are mutable and can be resized. Tuples
are static arrays that are immutable and cannot be changed once
created.
Tuples are cached by the Python runtime, which means the interpreter
does not need to ask the operating system for memory every time a tuple
is created.
Lists require more memory overhead than tuples.
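The memory-overhead difference can be observed with sys.getsizeof. The exact byte counts depend on the CPython version and platform, so the sketch below only compares the two rather than asserting specific numbers:

```python
import sys

as_list = [1, 2, 3]
as_tuple = (1, 2, 3)

# On CPython a list carries extra bookkeeping (and possibly spare
# capacity), so it reports more bytes than a tuple with the same elements.
print(sys.getsizeof(as_list), sys.getsizeof(as_tuple))

# Tuples are immutable: item assignment raises TypeError.
try:
    as_tuple[0] = 10
except TypeError:
    print("tuples cannot be modified")
```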
Lists
Indexing into a list (list[i]) is O(1).
The default search operation for lists (list.index()) is
a linear operation, with a worst-case time complexity of O(n). Sorting
the list can reduce the search time to O(log n) (using binary
search).
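One way to get that O(log n) search is to sort once and then use the standard-library bisect module for the binary search; the helper below is a common idiom, not a built-in:

```python
import bisect

data = [37, 5, 12, 93, 40, 21]

# A linear scan via list.index() is O(n).
assert data.index(40) == 4

# Sort once, then locate values in O(log n) with binary search.
data.sort()  # [5, 12, 21, 37, 40, 93]

def binary_search(sorted_list, value):
    """Return the index of value in sorted_list, or -1 if absent."""
    i = bisect.bisect_left(sorted_list, value)
    if i < len(sorted_list) and sorted_list[i] == value:
        return i
    return -1

print(binary_search(data, 40))   # 4
print(binary_search(data, 99))   # -1
```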
Python lists use a built-in sorting algorithm called Timsort, which
runs in O(n) in the best case and O(n log n) in the worst case. It
achieves this by mixing insertion sort and merge sort, probing the data
to guess which algorithm will perform better on a given run.
The bisect module can insert into a list while keeping it
sorted (ascending by default). Some of the documentation for
bisect reads as follows[2]:
When writing time-sensitive code using bisect() and
insort(), keep the following concepts in mind.
Binary search is efficient for searching a certain range of values.
For locating a specific value, dictionary performance is better.
The time complexity of the insort() function is O(n)
because the logarithmic search steps are dominated by the linear
insertion steps.
These search functions are stateless and discard the results of the
key function after they are used. Therefore, if the search function is
used in a loop, the key function may be called repeatedly on the same
data element. If the key function is not fast, consider wrapping it with
functools.cache()
to avoid redundant calculations. Alternatively, consider searching a
precomputed key array to locate the insertion point (as demonstrated in
the example section below).
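A minimal sketch of keeping a list sorted with insort (ascending, the default); the values are arbitrary:

```python
import bisect

scores = []
for value in [50, 20, 80, 20, 65]:
    # insort finds the insertion point in O(log n), but the insertion
    # itself shifts elements, so each call is O(n) overall.
    bisect.insort(scores, value)

print(scores)  # [20, 20, 50, 65, 80]
```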
When allocating memory for a list, Python reserves some extra space.
When a list of size N has no spare capacity left and new elements must
be added, Python creates a new list large enough to hold the original N
elements plus the additional ones. The actual allocation is not N+1 but
N+M, where M is calculated as follows:
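The exact growth formula is a CPython implementation detail that has changed between versions, but the over-allocation itself can be observed empirically with sys.getsizeof:

```python
import sys

# Append elements one at a time and watch the reported size. The size
# stays flat across several appends (reuse of reserved capacity), then
# jumps when Python reallocates to N + M slots.
lst = []
sizes = []
for i in range(20):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# Exact byte values depend on the CPython version and platform;
# the repeated values are the appends that reused spare capacity.
print(sizes)
```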