This is an automatically translated post by LLM. The original post is in Chinese. If you find any translation errors, please leave a comment to help me improve the translation. Thanks!

Overview of BP Neural Network

The BP (backpropagation) neural network is one of the most basic and widely used neural networks. In its simplest form it consists of three layers of nodes: the input layer, the hidden layer, and the output layer. It is relatively simple to implement, and training is divided into two parts: forward propagation of the inputs and backward propagation of the errors.

The universal approximation theorem provides the theoretical basis for neural networks. Informally, a linear combination of one-dimensional step functions can approximate any continuous one-dimensional function on a closed interval, and a sufficiently steep sigmoid function can approximate a step function. Therefore, a linear combination of one-dimensional sigmoid functions can approximate any such continuous function.

The advantage of neural networks is that many complex function mappings that are difficult to express directly can be approximated by combining multiple simple one-dimensional functions. The main difficulty in building a neural network is how to combine these functions, that is, how to choose the weights.
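To make the approximation idea concrete, here is a minimal sketch that approximates \(g(x)=x^2\) on \([0,1]\) with a sum of steep sigmoids. The names `soft_step` and `staircase`, the target function, the 50 intervals, and the steepness of 200 are all assumptions chosen for illustration; they do not come from the theorem itself.

```python
import math

def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def soft_step(x: float, threshold: float, steepness: float) -> float:
    """Sigmoid shifted to `threshold`; a larger `steepness` looks more like a step."""
    return sigmoid(steepness * (x - threshold))

def staircase(x: float, knots, heights, steepness: float = 200.0) -> float:
    """Linear combination of soft steps: a smoothed piecewise-constant function."""
    total = heights[0]
    for k in range(1, len(knots)):
        total += (heights[k] - heights[k - 1]) * soft_step(x, knots[k], steepness)
    return total

# Approximate g(x) = x**2 on [0, 1] with 50 soft steps (illustrative choice).
n = 50
knots = [i / n for i in range(n)]
heights = [(k + 0.5 / n) ** 2 for k in knots]  # g at the middle of each interval
xs = [i / 1000 for i in range(1001)]
max_err = max(abs(staircase(x, knots, heights) - x * x) for x in xs)
print(f"max error with {n} soft steps: {max_err:.4f}")
```

Using more (and steeper) soft steps makes the error smaller, which is the intuition behind the theorem. A trained network finds such combinations by adjusting its weights rather than by placing the steps by hand.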

Understanding BP Neural Network

The BP neural network can be seen as a multi-input multi-output function. If we ignore its internal structure, it can be represented as a black box model:

Black box model of BP neural network

In this BP neural network, there are \(m\) inputs and \(n\) outputs, with a hidden layer between the input and output layers. So how many nodes should the hidden layer have? Generally, the number of hidden nodes is chosen with the following empirical formula: \[ h=\sqrt{m+n}+a \] where \(h\) is the number of nodes in the hidden layer, \(m\) is the number of nodes in the input layer, \(n\) is the number of nodes in the output layer, and \(a\) is an adjustment constant, commonly chosen between 1 and 10.
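As a quick illustration, the empirical formula can be evaluated directly. The function name `hidden_nodes` and the rounding to the nearest integer are assumptions; the formula itself only gives a rough starting point that is usually tuned by experiment.

```python
import math

def hidden_nodes(m: int, n: int, a: int = 1) -> int:
    """Empirical hidden-layer size h = sqrt(m + n) + a, rounded to the nearest integer."""
    return round(math.sqrt(m + n) + a)

# Try a few values of the adjustment constant for m = 3 inputs and n = 3 outputs.
for a in (1, 5, 10):
    print(f"a={a}: h={hidden_nodes(3, 3, a)}")
```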

Based on the number of input and output nodes, we can construct a simple BP neural network model. Its internal structure is as follows (taking \(m=3\), \(n=3\), and \(h=3\) as an example):

Internal structure of a three-layer neural network

With such a three-layer neural network, a wide range of continuous 3D-to-3D mappings can be approximated by combining one-dimensional functions. So how is this mapping established? This is the problem of training the BP neural network. The training process mainly consists of two parts: forward propagation of results and backward propagation of residuals.

Forward Propagation

For any node in the BP neural network, its input is the weighted sum of the outputs of the nodes in the previous layer. Taking the hidden layer as an example, let the output of input layer node \(i\) be \(x_i\), the input of hidden layer node \(j\) be \(net_j\), the weight connecting node \(i\) in the input layer to node \(j\) in the hidden layer be \(w_{ij}\), and the bias (constant term) be \(b_j\). Then the input of the hidden layer node is: \[ net_j=\sum_{i=1}^m w_{ij}x_i+b_j \] In the BP neural network, the sigmoid function is used as the activation function so that the activation is differentiable everywhere. The output of the node is: \[ f(net_j)=\frac1{1+e^{-net_j}} \]

Advantages of using sigmoid function:

  1. Compared to the step function, it is differentiable everywhere in its domain.
  2. Let \(y=sigmoid(x)\), then \(y'=y(1-y)\). It can be seen that the derivative of the sigmoid function can be represented using itself. Once the value of the sigmoid function is calculated, it is very convenient to calculate the value of its derivative. This provides convenience for using gradient descent in backpropagation.

Main disadvantages of the sigmoid function:

  1. Vanishing gradient: When the output of the sigmoid function approaches 0 or 1, the curve becomes flat, which means that the gradient of the sigmoid tends to 0. Neurons that use the sigmoid activation function and have outputs close to 0 or 1 are called saturated neurons. The weights of these neurons are barely updated, and the weights of neurons connected to them are also updated slowly. This is called the vanishing gradient problem. If a large neural network contains many sigmoid neurons in a saturated state, backpropagation becomes ineffective and the network learns very slowly.

  2. Not zero-centered: The output of the sigmoid is not zero-centered.

  3. High computational cost: The exp() function has a higher computational cost compared to other nonlinear activation functions.
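The derivative identity and the saturation effect described above can be checked numerically. This is a small sketch; the sample points and the finite-difference step `h` are arbitrary choices made for illustration.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Derivative written in terms of the output itself: y' = y * (1 - y)."""
    y = sigmoid(x)
    return y * (1.0 - y)

h = 1e-6  # step for the central finite difference
for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
    print(f"x={x:6.1f}  y={sigmoid(x):.5f}  y(1-y)={sigmoid_grad(x):.6f}  numeric={numeric:.6f}")
```

The gradient peaks at 0.25 when \(x=0\) and falls below \(10^{-4}\) at \(x=\pm10\), which is the saturation that causes vanishing gradients.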

Sigmoid function

Each neuron performs this independent calculation, so for a set of inputs, the neural network can perform calculations to obtain the corresponding outputs. This is the process of forward propagation.
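Forward propagation through the three-layer network can be sketched with NumPy as follows. The random weights, zero biases, and the input vector are placeholder values chosen for illustration, and the array layout (`W[i, j]` connects node `i` of the previous layer to node `j`) is an assumption that mirrors the \(w_{ij}\) notation above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Forward pass: x has shape (m,), W1 has shape (m, h), W2 has shape (h, n)."""
    hidden = sigmoid(x @ W1 + b1)       # net_j = sum_i w_ij * x_i + b_j, then sigmoid
    output = sigmoid(hidden @ W2 + b2)  # same rule from the hidden layer to the output layer
    return hidden, output

rng = np.random.default_rng(0)
m, h, n = 3, 3, 3
W1, b1 = rng.normal(size=(m, h)), np.zeros(h)
W2, b2 = rng.normal(size=(h, n)), np.zeros(n)
x = np.array([0.5, -1.0, 2.0])

hidden, y = forward(x, W1, b1, W2, b2)
print("hidden:", hidden)
print("output:", y)
```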

Backward Propagation

At the beginning, all the weights in the network are initialized randomly. To make the model approach the desired result by learning from training data, the weights and biases need to be adjusted continuously. The basic idea of backward propagation is the gradient descent algorithm from nonlinear optimization, with the goal of minimizing the loss function. The general process is as follows:

  • Set the loss function. Let \(d_j\) be the desired output of output layer node \(j\) and \(y_j\) its actual output. The loss function is as follows:

\[ E(w,b)=\frac12\sum_{j=1}^{n}(d_j-y_j)^2 \]

  • Modify the \(w\) and \(b\) from the hidden layer to the output layer through the loss function. For the weight \(w_{ij}\) from the hidden layer node \(i\) to the output layer node \(j\), the modification is as follows (where \(\eta\) is the learning rate):

\[ \Delta w_{ij}=-\eta\frac{\partial E}{\partial w_{ij}} \]

  • Similarly, the modification for the bias \(b_j\) of output layer node \(j\) is:

\[ \Delta b_j=-\eta\frac{\partial E}{\partial b_j} \]

This is basically the idea. The process of calculating partial derivatives is quite complex, so I won't go into detail here. Just remember the idea of using gradient descent to minimize the loss function.
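For readers who want to see the update rule in action, here is a minimal sketch of gradient descent on a 3-3-3 sigmoid network fitted to a single training sample. The sample, the learning rate of 0.5, and the 2000 iterations are illustrative assumptions. The partial derivatives are obtained with the chain rule, using \(y'=y(1-y)\) for the sigmoid.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, d, W1, b1, W2, b2, eta=0.5):
    """One gradient-descent update (in place). Returns the loss before the update."""
    hidden = sigmoid(x @ W1 + b1)
    y = sigmoid(hidden @ W2 + b2)
    # dE/dnet for the output layer: (y - d) * y * (1 - y)
    delta_out = (y - d) * y * (1.0 - y)
    # Propagate the residual back through W2 to the hidden layer
    delta_hid = (W2 @ delta_out) * hidden * (1.0 - hidden)
    # Delta w = -eta * dE/dw, Delta b = -eta * dE/db
    W2 -= eta * np.outer(hidden, delta_out)
    b2 -= eta * delta_out
    W1 -= eta * np.outer(x, delta_hid)
    b1 -= eta * delta_hid
    return 0.5 * np.sum((d - y) ** 2)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 3)), np.zeros(3)
x = np.array([0.5, -1.0, 2.0])   # one training input (made up)
d = np.array([0.1, 0.9, 0.5])    # its desired output (made up)

losses = [train_step(x, d, W1, b1, W2, b2) for _ in range(2000)]
print(f"loss before training: {losses[0]:.5f}, after: {losses[-1]:.2e}")
```

A real training loop would iterate over many samples and monitor the loss on held-out data, but the update for each weight has the same form.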



High-pass filters are often used to enhance images and extract edge information. They have a wide range of applications in image processing and image recognition. The main types of high-pass filters are as follows: unsharp mask, Sobel operator, Laplacian operator, and Canny algorithm.

Unsharp Mask

This algorithm is often used for image enhancement and is not commonly used for edge extraction. The implementation method is as follows:

  • For the original image \(f(x,y)\), apply Gaussian blur to obtain the smoothed image \(\overline{f(x,y)}\).
  • Take the difference between the original image and the smoothed image to obtain the mask \(g_{mask}(x,y)=f(x,y)-\overline{f(x,y)}\).
  • Add the original image and k times the mask to obtain the enhanced image \(g(x,y)=f(x,y)+k*g_{mask}(x,y)\).
  • Generally, k is set to 1, which gives standard unsharp masking. If k is greater than 1, the process is called high-boost filtering.
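The steps above can be sketched in a few lines of NumPy on a tiny synthetic image. The fixed 3*3 kernel `GAUSS3`, the edge-replicated border, and the test image with a vertical step edge are assumptions made for this illustration; the full implementation used for the figures is in the source code at the end of the post.

```python
import numpy as np

# A common 3*3 Gaussian approximation (assumed here for simplicity)
GAUSS3 = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0

def filter3(img, kernel):
    """Apply a 3*3 kernel with edge-replicated borders."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for k in range(3):
        for l in range(3):
            out += kernel[k, l] * padded[k:k + rows, l:l + cols]
    return out

def unsharp_mask(img, k=1.0):
    """Return (enhanced image, mask) following g = f + k * (f - blurred f)."""
    blurred = filter3(img, GAUSS3)
    mask = img - blurred
    return img + k * mask, mask

# Synthetic image: dark on the left, bright on the right (vertical step edge)
img = np.zeros((5, 6))
img[:, 3:] = 100.0
enhanced, mask = unsharp_mask(img, k=1.0)
print("mask row:    ", mask[2])
print("enhanced row:", enhanced[2])
```

The mask is non-zero only next to the edge, and the enhanced row overshoots on both sides of it (below 0 and above 100), which is what makes the edge look sharper.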

The image processing effect of this algorithm is as follows:

From left to right: original image, obtained mask (normalized to 0-255), enhanced image

From the results in the above figure, the mask image is bright at the edges and dark elsewhere. After the mask is added to the original image, the edges become brighter, and some values fall outside the range 0-255. Here, the overlaid image is normalized as a whole to the range 0-255 and converted to integers. Because the edge values are large after the overlay, normalization makes the bright parts of the original image darker. An alternative is to clip the values, treating values greater than 255 as 255 and values less than 0 as 0. This avoids the darkening, but the edges are less pronounced than with normalization. Which method to use depends on the actual situation.
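The difference between the two ways of bringing values back into 0-255 can be seen on a handful of made-up pixel values. The helper names and the sample values below are hypothetical and only serve the comparison.

```python
import numpy as np

def to_uint8_normalized(values):
    """Linearly rescale so that the minimum maps to 0 and the maximum to 255."""
    v = np.asarray(values, dtype=float)
    span = v.max() - v.min()
    if span == 0:
        return np.zeros(v.shape, dtype=np.uint8)
    return np.round((v - v.min()) / span * 255).astype(np.uint8)

def to_uint8_clipped(values):
    """Clamp values below 0 to 0 and values above 255 to 255."""
    return np.clip(np.round(values), 0, 255).astype(np.uint8)

# Values after adding the mask: some overshoot below 0 and above 255
enhanced = np.array([-40.0, 0.0, 100.0, 200.0, 255.0, 295.0])
normalized = to_uint8_normalized(enhanced)
clipped = to_uint8_clipped(enhanced)
print("normalized:", normalized)
print("clipped:   ", clipped)
```

With normalization, a pixel that was already at 255 ends up darker because the overshoot at the edges stretches the range. With clipping, it stays at 255 but the overshoot that marks the edge is lost.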

Sobel Operator

The Sobel operator is often used for edge extraction in images. It extracts edges using the first-order derivative of the image, that is, the gradient \(\nabla f\).

We know that the gradient of a function of two variables consists of the partial derivatives in the x and y directions. The Sobel operator is the same, with two operators for the x and y directions, respectively, as follows: \[ \begin{bmatrix}-1&0&1\\-2&0&2\\-1&0&1\end{bmatrix} ~~~~~~~~~~\begin{bmatrix}-1&-2&-1\\0&0&0\\1&2&1\end{bmatrix} \]

  • Convolve the image with the above two operators separately to obtain the x-direction and y-direction edges \(G_x\) and \(G_y\).
  • The gradient magnitude \(G\) of the image can be calculated from \(G_x\) and \(G_y\) as \(G=\sqrt{G_x^2+G_y^2}\). Sometimes, for fast calculation, \(G=|G_x|+|G_y|\) or \(G=\max\{|G_x|,|G_y|\}\) can be used.
  • Normalize G to the range of 0-255 to obtain the Sobel edges of the image.
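The three steps can be sketched with NumPy on a synthetic image containing one vertical edge. The helper `filter3`, the edge-replicated border, and the test image are assumptions for this illustration. Like the source code at the end of the post, the kernel is applied without flipping (correlation), which only changes the sign of \(G_x\) and \(G_y\) and not the magnitude.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

def filter3(img, kernel):
    """Apply a 3*3 kernel with edge-replicated borders."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for k in range(3):
        for l in range(3):
            out += kernel[k, l] * padded[k:k + rows, l:l + cols]
    return out

def sobel_edges(img):
    """Return G_x, G_y and the gradient magnitude normalized to 0-255."""
    gx = filter3(img, SOBEL_X)
    gy = filter3(img, SOBEL_Y)
    g = np.sqrt(gx ** 2 + gy ** 2)
    if g.max() > 0:
        g = g / g.max() * 255
    return gx, gy, g.astype(np.uint8)

img = np.zeros((5, 6))
img[:, 3:] = 100.0          # vertical step edge between columns 2 and 3
gx, gy, edges = sobel_edges(img)
print("G_x row:  ", gx[2])
print("edges row:", edges[2])
```

Only the two columns next to the step respond, and \(G_y\) is zero everywhere because the image does not change in the vertical direction.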

From left to right: original image, x-direction edges, y-direction edges, Sobel edges

The Sobel operator performs well in extracting obvious edges in images and also performs well in extracting detailed edges in images. This can be illustrated by the processing results of another image:

It can be seen that even the small edge contours on the chimney are well represented by the edges extracted by the Sobel operator.

Laplacian Operator

The Laplacian operator extracts edges using the second-order derivative of the image. In one dimension, the second-order difference is: \[ [f(x+1)-f(x)]-[f(x)-f(x-1)]=f(x+1)+f(x-1)-2f(x) \] Applying this difference in both the x and y directions and adding the results gives the Laplacian operator: \[ \begin{bmatrix}0&1&0\\1&-4&1\\0&1&0\end{bmatrix} \] Sometimes, the second-order derivative in the diagonal direction is also added, and the operator becomes: \[ \begin{bmatrix}1&1&1\\1&-8&1\\1&1&1\end{bmatrix} \] In practice, the sign-inverted versions of these two operators are also often used: \[ \begin{bmatrix}0&-1&0\\-1&4&-1\\0&-1&0\end{bmatrix} \begin{bmatrix}-1&-1&-1\\-1&8&-1\\-1&-1&-1\end{bmatrix} \]

Assuming that the extracted image edges are \(L(x,y)\), the enhanced image is \(g(x,y)=f(x,y)+c*L(x,y)\).

If one of the first two operators (negative center coefficient) is used to extract the edges, c=-1; if one of the last two operators (positive center coefficient) is used, c=1.
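The sign convention for \(c\) can be verified with a short sketch: both the negative-center and the positive-center operator give the same enhanced image once \(c\) is chosen accordingly. The helper names, the border handling, and the test image are assumptions for this illustration.

```python
import numpy as np

LAPLACE4_NEG = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
LAPLACE4_POS = -LAPLACE4_NEG  # center coefficient +4

def filter3(img, kernel):
    """Apply a 3*3 kernel with edge-replicated borders."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for k in range(3):
        for l in range(3):
            out += kernel[k, l] * padded[k:k + rows, l:l + cols]
    return out

def laplacian_sharpen(img, kernel):
    """g = f + c * L, with c = -1 for a negative center and c = +1 for a positive one."""
    c = -1.0 if kernel[1, 1] < 0 else 1.0
    return img + c * filter3(img, kernel)

img = np.zeros((5, 6))
img[:, 3:] = 100.0          # vertical step edge
sharp_neg = laplacian_sharpen(img, LAPLACE4_NEG)
sharp_pos = laplacian_sharpen(img, LAPLACE4_POS)
print("enhanced row:", sharp_neg[2])
```

The enhanced row undershoots on the dark side of the edge and overshoots on the bright side, so the values still need to be normalized or clipped before being saved as an image.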

The image edge information extracted using the Laplacian operator is as follows:

From left to right: original image, Laplacian edges without diagonal, Laplacian edges with diagonal

The edges extracted by the Laplacian operator have two edge lines at each edge, which is determined by the properties of its second-order derivative. Compared with the edges extracted by the Sobel operator, the Laplacian edges are more detailed and capture the "edges of edges". This can be seen more clearly from the comparison in the following figure:

The left side is the result of the Sobel operator, and the right side is the result of the Laplacian operator

For slightly more complex images, the edges extracted by the Laplacian operator are too detailed, making it difficult to see many areas clearly, which poses some difficulties for human visual perception. However, in some image recognition fields, such as using satellite images to identify ground vehicles, the vehicles on the ground are often small color blocks, and the Laplacian operator can well outline the edge contours and some details inside these vehicles. The Sobel operator is not as effective in capturing these internal details. At the same time, the property of "edges of edges" makes the edges extracted by the Laplacian operator suitable for image enhancement.

Canny Algorithm

The Canny algorithm refines Sobel edge extraction. The Sobel operator keeps every edge response in the final image, regardless of its strength, so many weak or spurious edges end up in the result. The basic idea of the Canny algorithm is to filter this edge information and keep only the pixels that are most likely to be edges. The implementation method is as follows:

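  • Smooth the image with a Gaussian filter, then calculate \(G_x\) and \(G_y\) as in the Sobel operator.
  • Calculate the gradient magnitude \(weight=\sqrt{G_x^2+G_y^2}\) and the gradient direction \(angle=\arctan\frac{G_y}{G_x}\) for each pixel based on \(G_x\) and \(G_y\).
  • Discretize the angle to the nearest multiple of \(45^\circ\).
  • Non-maximum suppression: for each pixel \((x,y)\) in the image, compare its weight \(weight(x,y)\) with the weights of its two neighbors along the gradient direction, one in the \(angle(x,y)\) direction and one in the opposite direction. If the weight of the pixel is not the largest of the three, set it to 0.
  • Double threshold detection: set upper and lower thresholds for the edge strength and filter the edges once more. In the standard algorithm, pixels above the upper threshold are kept, pixels below the lower threshold are discarded, and pixels in between are kept only if they are connected to a pixel above the upper threshold. The source code in the appendix applies only a lower threshold. Finally, normalize the edge information to the range of 0-255. The thresholds can be set manually.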
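The last three steps can be sketched as follows. This is a simplified illustration and not a full Canny implementation: the function names are hypothetical, the direction is computed with `arctan2` so that \(G_x=0\) needs no special case, and the double threshold checks only the 8 direct neighbors of each weak pixel in a single pass instead of tracking connected edges.

```python
import numpy as np

# Neighbor offsets (row, column) along the gradient for 0, 45, 90 and 135 degrees
OFFSETS = {0: (0, 1), 1: (1, 1), 2: (1, 0), 3: (1, -1)}

def discretize_angle(gx, gy):
    """Fold the gradient direction into [0, 180) degrees and round to a multiple of 45."""
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    return np.round(angle / 45.0).astype(int) % 4

def non_max_suppression(weight, direction):
    """Keep a pixel only if it is at least as large as both neighbors along the gradient."""
    out = np.zeros_like(weight)
    rows, cols = weight.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            di, dj = OFFSETS[int(direction[i, j])]
            if weight[i, j] >= weight[i + di, j + dj] and weight[i, j] >= weight[i - di, j - dj]:
                out[i, j] = weight[i, j]
    return out

def double_threshold(weight, low, high):
    """Keep strong pixels, and weak pixels that touch a strong pixel (simplified)."""
    strong = weight >= high
    weak = (weight >= low) & ~strong
    padded = np.pad(strong, 1, mode="constant")
    rows, cols = weight.shape
    near_strong = np.zeros_like(strong)
    for di in range(3):
        for dj in range(3):
            near_strong |= padded[di:di + rows, dj:dj + cols]
    return np.where(strong | (weak & near_strong), weight, 0.0)

# A blurry vertical edge: the gradient magnitude peaks in the middle column
weight = np.tile(np.array([0.0, 1.0, 3.0, 1.0, 0.0]), (5, 1))
direction = discretize_angle(weight, np.zeros_like(weight))  # horizontal gradient
thin = non_max_suppression(weight, direction)
print("thinned row:", thin[2])

strengths = np.array([[0.0, 50.0, 120.0, 200.0, 0.0, 60.0]])
print("after thresholds:", double_threshold(strengths, low=40.0, high=150.0)[0])
```

Non-maximum suppression thins the three-pixel-wide response to a single column, and the double threshold keeps the weak value 120 because it touches the strong value 200, while the isolated weak values 50 and 60 are dropped.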

From left to right: original image, edges without double threshold detection, edges with lower threshold of 150, edges with lower threshold of 200

Appendix


Source Code

import cv2 as cv
import numpy as np
import math

sigma=1.5 # Parameter (standard deviation) of the Gaussian filter

def add_zeros(img,edge): # Pad the image with `edge` rows/columns of zeros on every side
    shape=img.shape
    temp=np.zeros((shape[0]+2*edge,shape[1]+2*edge))
    for i in range(shape[0]):
        for j in range(shape[1]):
            temp[i+edge][j+edge]=img[i][j][0] # Only the first channel is used
    return temp

def f(x,y): # Define the 2D Gaussian distribution function
    return 1/(2*math.pi*sigma**2)*math.exp(-(x**2+y**2)/(2*sigma**2))

def gauss(n): # Generate an n*n integer Gaussian filter
    mid=n//2
    filt=np.zeros((n,n))
    for i in range(n):
        for j in range(n):
            filt[i][j]=f(i-mid,j-mid)/f(-mid,-mid) # Scale so that the corner value is 1
    return np.rint(filt).astype(np.uint8) # Round to the nearest integer instead of truncating

def gauss_filter(img,n): # Apply an n*n Gaussian filter to the image
    filt=gauss(n)
    con=1/np.sum(filt) # Normalization constant so that the weights sum to 1
    shape=img.shape
    temp=add_zeros(img,n//2)
    result=np.zeros((shape[0],shape[1],1))
    for i in range(shape[0]):
        for j in range(shape[1]):
            tmp=0
            for k in range(n):
                for l in range(n):
                    tmp+=filt[k][l]*temp[i+k][j+l]
            result[i][j][0]=con*tmp
    return result.astype(np.uint8)

def unsharp_mask(img,n,is_mask=0): # Unsharp mask using n*n Gaussian blur, return the mask if is_mask=1
    shape=img.shape
    new_img=np.zeros((shape[0],shape[1],1))
    for i in range(shape[0]):
        for j in range(shape[1]):
            new_img[i][j][0]=img[i][j][0]
    mask=new_img-gauss_filter(img,n)
    for i in range(shape[0]):
        for j in range(shape[1]):
            if i==0 or j==0 or i==shape[0]-1 or j==shape[1]-1:
                mask[i][j][0]=0 # Ignore the border, which is distorted by the zero padding
    result=new_img+mask # k=1
    result=result-np.min(result) # Normalize the enhanced image to 0-255
    result=result/np.max(result)*255
    mask=mask-np.min(mask) # Normalize the mask to 0-255
    mask=mask/np.max(mask)*255
    if is_mask:
        return mask.astype(np.uint8)
    return result.astype(np.uint8)

sobelx=[[-1,0,1],[-2,0,2],[-1,0,1]]
sobely=[[-1,-2,-1],[0,0,0],[1,2,1]]
laplace4=[[0,-1,0],[-1,4,-1],[0,-1,0]]
laplace8=[[-1,-1,-1],[-1,8,-1],[-1,-1,-1]]

def filt_3(img,filt): # Arbitrary 3*3 filter (image, operator)
    shape=img.shape
    temp=add_zeros(img,1)
    result=np.zeros((shape[0],shape[1],1))
    for i in range(shape[0]):
        for j in range(shape[1]):
            tmp=0
            for k in range(3):
                for l in range(3):
                    tmp+=filt[k][l]*temp[i+k][j+l]
            result[i][j][0]=tmp
    return result

def laplace_edge(img,filt): # Laplacian edges after normalization
    tmp=filt_3(img,filt)
    tmp=tmp-np.min(tmp)
    shape=tmp.shape
    for i in range(shape[0]):
        for j in range(shape[1]):
            if i==0 or j==0 or i==shape[0]-1 or j==shape[1]-1:
                tmp[i][j][0]=0
    tmp=tmp/np.max(tmp)*255
    return tmp.astype(np.uint8)

def laplace(img,filt): # Overlay of the original image and the Laplacian edges (c=1, positive-center operators)
    tmp=filt_3(img,filt)
    shape=img.shape
    result=np.zeros((shape[0],shape[1]))
    for i in range(shape[0]):
        for j in range(shape[1]):
            result[i][j]=tmp[i][j][0]+img[i][j][0]
            if i==0 or j==0 or i==shape[0]-1 or j==shape[1]-1:
                result[i][j]=0
    result-=np.min(result)
    result=result/np.max(result)*255
    return result.astype(np.uint8)

def sobel(img): # Extract Sobel edges of the image
    shape=img.shape
    sobx=filt_3(img,sobelx)
    soby=filt_3(img,sobely)
    result=np.zeros((shape[0],shape[1]))
    for i in range(shape[0]):
        for j in range(shape[1]):
            if i==0 or j==0 or i==shape[0]-1 or j==shape[1]-1:
                result[i][j]=0
            else:
                result[i][j]=math.sqrt(sobx[i][j][0]**2+soby[i][j][0]**2)
    result=result/np.max(result)*255
    return result.astype(np.uint8)

def canny(img,n=3,low=100): # Canny edges (n*n Gaussian blur, lower threshold `low` on the 0-255 scale)
    # Neighbor offsets (di1,dj1,di2,dj2) along the gradient for 0, 45, 90 and 135 degrees
    de=[[0,1,0,-1],[1,1,-1,-1],[1,0,-1,0],[1,-1,-1,1]]
    shape=img.shape
    tmp=gauss_filter(img,n)
    sobx=filt_3(tmp,sobelx)
    soby=filt_3(tmp,sobely)
    weight=np.zeros((shape[0],shape[1]))
    angle=np.zeros((shape[0],shape[1]),dtype=int) # np.int was removed in NumPy 1.24
    result=np.zeros((shape[0],shape[1]))
    for i in range(shape[0]):
        for j in range(shape[1]):
            weight[i][j]=math.sqrt(sobx[i][j][0]**2+soby[i][j][0]**2)
            # Fold the gradient direction into [0,180) degrees and round to a multiple of 45
            theta=math.degrees(math.atan2(soby[i][j][0],sobx[i][j][0]))%180
            angle[i][j]=int(round(theta/45))%4
    for i in range(1,shape[0]-1):
        for j in range(1,shape[1]-1):
            d=de[angle[i][j]]
            # Non-maximum suppression: drop the pixel if either neighbor along the gradient is larger
            if weight[i][j]<weight[i+d[0]][j+d[1]] or weight[i][j]<weight[i+d[2]][j+d[3]]:
                result[i][j]=0
            else:
                result[i][j]=weight[i][j]
    result=result/np.max(result)*255
    for i in range(shape[0]):
        for j in range(shape[1]):
            if result[i][j]<low: # Only a lower threshold is applied here
                result[i][j]=0
    return result.astype(np.uint8)


filename=["test3_corrupt.pgm","test4.tif"]
for i in filename:
    img=cv.imread(i)
    cv.imwrite(i+"_mask.bmp",unsharp_mask(img,3,1))
    cv.imwrite(i+"_unsharp_mask.bmp",unsharp_mask(img,3))
    cv.imwrite(i+"_sobel.bmp",sobel(img))
    cv.imwrite(i+"_canny.bmp",canny(img,3))
    cv.imwrite(i+"_laplace4_edge.bmp",laplace_edge(img,laplace4))
    cv.imwrite(i+"_laplace8_edge.bmp",laplace_edge(img,laplace8))


Low-pass filters are very useful in our daily lives. Image blurring, image denoising, and image recognition all require the use of low-pass filters. Low-pass filtering means removing the high-frequency components (rapidly changing parts) from an image and leaving the low-frequency components (parts with less noticeable changes). There are two implementations of filters: spatial domain and frequency domain. Spatial domain filters operate directly on the spatial image, convolving the image matrix with the filter kernel to obtain the filtered output. Frequency domain filters transform the image to the frequency domain using the Fourier transform and then multiply it with the filter to obtain the output.

This article mainly introduces spatial domain low-pass filters and their implementations.

Commonly used spatial domain filters include average filtering, median filtering, and Gaussian filtering. Here, we mainly introduce median filtering and Gaussian filtering.

Median Filtering

As the name suggests, median filtering is a filter based on statistical methods. The specific implementation method is as follows: for an n*n median filter, the pixel value of the output image at (x, y) is equal to the median of all pixel values in the n*n image area centered at (x, y) in the input image.

The size of the filter can be selected according to different needs.
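A compact NumPy sketch of the median filter described above is shown here. It is an illustration under two assumptions: the border is handled by replicating edge pixels (the source code at the end of the post pads with zeros instead), and the test image with a single bright outlier is made up.

```python
import numpy as np

def median_filter(img, n=3):
    """n*n median filter on a 2D image with edge-replicated borders."""
    mid = n // 2
    padded = np.pad(img, mid, mode="edge")
    rows, cols = img.shape
    # Stack the n*n shifted copies so that axis 0 holds each pixel's neighborhood
    windows = np.stack([padded[k:k + rows, l:l + cols]
                        for k in range(n) for l in range(n)])
    return np.median(windows, axis=0).astype(img.dtype)

img = np.full((5, 5), 10, dtype=np.uint8)
img[2, 2] = 255                      # a single "salt" pixel
print(median_filter(img, 3))
```

Because the outlier is never the median of its neighborhood, it disappears completely, while the surrounding pixel values are left unchanged.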

The effect of the median filter is as follows:

The original image, 3*3 median filter, 5*5 median filter, and 7*7 median filter results, respectively.

The original image contains a lot of irregularly distributed noise, some of which are salt noise, but most of them are sudden impulse noise. When using a 3*3 median filter, the salt noise has been removed well, but the impulse noise is still obvious. When using a 5*5 median filter, the situation of impulse noise has been greatly improved. The 7*7 filter almost completely removes the noise, but the image is also severely blurred.

Gaussian Filtering

Gaussian filtering is a commonly used blurring method in some image processing software. It is generated by the two-dimensional normal distribution function: \(p(x,y)=\frac1{2\pi\sigma^2}e^{-\frac{x^2+y^2}{2\sigma^2}}\). The specific steps are as follows:

  • To generate an n*n Gaussian filter, take the center of the n*n matrix as the origin (0,0) and assign coordinates to the other positions in the matrix relative to it.
  • Substitute the coordinates of each position in the n*n matrix into the two-dimensional normal distribution function to obtain the value of each position.
  • Scale the values in the matrix so that the value in the upper left corner becomes 1.
  • Round the matrix to integers to obtain the n*n Gaussian filter.

The function for generating a Gaussian filter is as follows:

import math
import numpy as np
sigma=1.5 # Parameter (standard deviation) of the Gaussian filter
def f(x,y): # Define the two-dimensional normal distribution function
    return 1/(2*math.pi*sigma**2)*math.exp(-(x**2+y**2)/(2*sigma**2))

def gauss(n): # Generate an n*n integer Gaussian filter
    mid=n//2
    filt=np.zeros((n,n))
    for i in range(n):
        for j in range(n):
            filt[i][j]=f(i-mid,j-mid)/f(-mid,-mid) # Scale so that the corner value is 1
    return np.rint(filt).astype(np.uint8) # Round to the nearest integer instead of truncating

The Gaussian filters of size 3*3, 5*5, and 7*7 are as follows:

Their images in the two-dimensional coordinate system are as follows:

After obtaining the operator of the Gaussian filter, convolve it with the input image to obtain the result of the Gaussian filter.
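As a complement, here is a sketch of Gaussian filtering with a floating-point kernel that is normalized to sum to 1, instead of the integer kernel built above. This variant, the function names, and the edge-replicated border are assumptions for illustration; the article's own implementation is in the source code at the end of the post.

```python
import numpy as np

def gaussian_kernel(n, sigma=1.5):
    """n*n Gaussian weights centered on the middle cell, normalized to sum to 1."""
    ax = np.arange(n) - n // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(img, n=3, sigma=1.5):
    """Filter a 2D image with the n*n Gaussian kernel (edge-replicated borders)."""
    kernel = gaussian_kernel(n, sigma)
    mid = n // 2
    padded = np.pad(img.astype(float), mid, mode="edge")
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for k in range(n):
        for l in range(n):
            out += kernel[k, l] * padded[k:k + rows, l:l + cols]
    return out

img = np.zeros((5, 5))
img[2, 2] = 255.0                    # a single bright outlier
blurred = gaussian_blur(img, 3)
print(np.round(gaussian_kernel(3), 4))
print(np.round(blurred[1:4, 1:4], 1))
```

The outlier is spread over its neighbors rather than removed, which is why Gaussian filtering softens impulse noise without eliminating it.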

The effect is as follows:

The original image, 3*3, 5*5, and 7*7 Gaussian filtering results, respectively.

Gaussian blur is more effective at removing salt noise in the image, but it has more difficulty removing impulse noise. Even the 7*7 Gaussian filter cannot remove the impulse noise in the image. On the other hand, compared with median filtering, Gaussian blur retains more information of the image and preserves more details.

Appendix


Source Code

import cv2 as cv
import numpy as np
import math

sigma=1.5 # Parameter (standard deviation) of the Gaussian filter
def f(x,y): # Define the two-dimensional normal distribution function
    return 1/(2*math.pi*sigma**2)*math.exp(-(x**2+y**2)/(2*sigma**2))

def gauss(n): # Generate an n*n integer Gaussian filter
    mid=n//2
    filt=np.zeros((n,n))
    for i in range(n):
        for j in range(n):
            filt[i][j]=f(i-mid,j-mid)/f(-mid,-mid) # Scale so that the corner value is 1
    return np.rint(filt).astype(np.uint8) # Round to the nearest integer instead of truncating

def gauss_filter(img,n): # Perform Gaussian filtering on the image img with an n*n convolution block
    filt=gauss(n)
    con=1/np.sum(filt) # Normalization constant so that the weights sum to 1
    shape=img.shape
    mid=n//2
    temp=np.zeros((shape[0]+n-1,shape[1]+n-1)) # Pad the edges with zeros
    for i in range(shape[0]):
        for j in range(shape[1]):
            temp[i+mid][j+mid]=img[i][j][0] # Only the first channel is used
    result=np.zeros((shape[0],shape[1]))
    for i in range(shape[0]):
        for j in range(shape[1]):
            tmp=0
            for k in range(n):
                for l in range(n):
                    tmp+=filt[k][l]*temp[i+k][j+l]
            result[i][j]=con*tmp
    return result.astype(np.uint8)

def center_filter(img, n): # Perform low-pass filtering on the image using an n*n median filter
    mid=n//2
    shape=img.shape
    temp=np.zeros((shape[0]+n-1,shape[1]+n-1)) # Pad the edges with zeros
    for i in range(shape[0]):
        for j in range(shape[1]):
            temp[i+mid][j+mid]=img[i][j][0]
    result=np.zeros((shape[0],shape[1]))
    tmp=np.zeros(n*n) # Buffer holding the n*n neighborhood of the current pixel
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(n):
                for l in range(n):
                    tmp[k*n+l]=temp[i+k][j+l]
            result[i][j]=np.median(tmp)
    return result.astype(np.uint8)

filename=["test1.pgm","test2.tif"]
size=[3,5,7]
for i in filename:
    img=cv.imread(i)
    for j in size:
        cv.imwrite(i+'gauss-'+str(j)+'.bmp',gauss_filter(img,j))
        cv.imwrite(i+'center-'+str(j)+'.bmp',center_filter(img,j))


Overall Design Concept

The playback of a video essentially involves a sequence of images displayed in a specific order at regular time intervals, creating a visual effect. Therefore, driving an LCD to play a video with a microcontroller involves pushing images to the LCD's buffer at certain intervals, continuously refreshing the screen to switch images. At the end of the article, I've included the source project files for reference.

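In digital image processing, histogram equalization is a common technique for adjusting image brightness and contrast and for sharpening images. A histogram is computed over the different colors in the image, and these statistics are used to remap the colors, thereby adjusting the contrast and sharpening the image.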