# Python High Performance Programming

This is an automatically translated post by LLM. The original post is in Chinese. If you find any translation errors, please leave a comment to help me improve the translation. Thanks!

# Python High Performance Programming

## Performance Analysis

### Time Analysis

Use decorators to automatically measure time:

1

2

3

4

5

6

7

8

9

10

11

12from functools import wraps

import time

def timefn(fn):

def measure_time(*args, **kwargs):

t1=time.time()

result=fn(*args, **kwargs)

t2=time.time()

print("@timefn:"+fn.__name__+" took "+str(t2-t1)+" seconds")

return result

return measure_timeIn IPython, you can use the

`%timeit`

magic function to time function runs:1

%timeit <function>

Use the UNIX

`time`

command to time:1

/usr/bin/time -p (--verbose) python <script-name>

Use the

`cProfile`

module:`cProfile`

is a built-in analysis tool in the standard library. It hooks into CPython's virtual machine to measure the time it takes for each function to run. Add the`@profile`

decorator before the function to be analyzed, and run the following command for performance analysis:1

python -m cProfile -s cumulative <script-name> <options>

You can also save the analysis results to a statistics file and use other tools for analysis:

1

python -m cProfile -o <outputfilename> <script-name> <options>

Use

`line_profiler`

for line-by-line analysis:Install:

`pip install line_profiler`

Add the

`@profile`

decorator before the function to be analyzed, and run the following command for performance analysis:1

kernprof -l -v <script-name>

### Space Analysis

Use

`memory_profiler`

to analyze memory usage:Install:

`pip install memory_profiler`

Add the

`@profile`

decorator before the function to be analyzed, and run the following command for performance analysis:1

python -m memory_profiler <script-name>

Use

`mprof`

to plot memory usage. This can be done in two steps:Run the program using

`mprof`

, which will generate an`mprof*.dat`

file in the directory:1

mprof run <script-name>

Plot the graph:

1

mprof plot

## Lists and Tuples

Data structures such as lists and tuples are called arrays, which are flat lists of data in some intrinsic order.

The main differences between lists and tuples are as follows:

- Lists are dynamic arrays that are mutable and can be resized. Tuples are static arrays that are immutable and cannot be changed once created.
- Tuples are cached in the Python runtime environment, which means that there is no need to access the kernel to allocate memory every time a tuple is used.
- Lists require more memory overhead than tuples.

### Lists

The query operation for lists is O(1).

The default search operation for lists (`list.index()`

) is
a linear operation, with a worst-case time complexity of O(n). Sorting
the list can reduce the search time to O(log n) (using binary
search).

Python lists have a built-in sorting algorithm that uses Tim sort. Tim sort sorts with a complexity of O(n) in the best case (O(n log n) in the worst case). It mixes insertion sort and merge sort to achieve this performance (guessing which algorithm is better for a given data set using a probing method).

The `bisect`

module can keep the order of the list
unchanged (default from small to large) when inserting into the list.
Some documentation for `bisect`

is as follows[2]:

`bisect.bisect_left(a, x, lo=0, hi=len(a), *, key=None)`

Find the leftmost point of insertion for x in list a

`bisect.bisect_right(a, x, lo=0, hi=len(a), *, key=None)`

Find the rightmost point of insertion for x in list a

`bisect.bisect(a, x, lo=0, hi=len(a), *, key=None)`

Same as

`bisect.bisect_left`

`bisect.insort_left(a, x, lo=0, hi=len(a), *, key=None)`

Insert x into list a in sorted order, with the insertion point determined by

`bisect.bisect_left`

`bisect.insort_right(a, x, lo=0, hi=len(a), *, key=None)`

Insert x into list a in sorted order, with the insertion point determined by

`bisect.bisect_right`

`bisect.insort(a, x, lo=0, hi=len(a), ***, key=None)`

Same as

`bisect.insort_right`

Performance NotesWhen writing time-sensitive code using

`bisect()`

and`insort()`

, keep the following concepts in mind.

- Binary search is efficient for searching a certain range of values. For locating a specific value, dictionary performance is better.
- The time complexity of the
`insort()`

function is O(n) because the logarithmic search steps are dominated by the linear insertion steps.- These search functions are stateless and discard the results of the key function after they are used. Therefore, if the search function is used in a loop, the key function may be called repeatedly on the same data element. If the key function is not fast, consider wrapping it with
`functools.cache()`

to avoid redundant calculations. Alternatively, consider searching a precomputed key array to locate the insertion point (as demonstrated in the example section below).

When allocating memory for a list, a certain amount of reserved space is added. When a list of size N needs to add new elements, Python creates a new list that is large enough to hold the original N elements plus the additional elements. In actual allocation, the size allocated is not N+1, but N+M, where M is calculated as follows:

M = (N >> 3) + (N < 9 ? 3 : 6) + 1

N | 0 | 1-4 | 5-8 | 9-16 | 17-25 | 26-35 | ... | 991-1120 |
---|---|---|---|---|---|---|---|---|

M | 0 | 1 | 2 | 3 | 4 | 5 | ... | 140 |