Up to this point, we’ve discussed serial communication errors caused by mismatched data rates and inaccurate built-in oscillators. The previous page described machining the project enclosure for a tool that tests such conditions.
This final page of the article describes a strange serial problem that turned out to be unrelated to crystals or baud rate. I’m documenting the issue to save you the hours of frustrating debugging that it took me to discover the cause.
The symptom is sporadic corruption of one or more characters during serial transmission. In this example, the carriage return and linefeed characters are mangled, but I also saw long runs of other characters go bad.
Corrupt CR LF characters
The text above shows the page number of an LCD screen advancing each time a button is pushed. The current screen number and the button statement are supposed to be on their own lines.
I tried a series of corrective actions and diagnostic techniques.
Note: You know you’re desperate when you start believing the chip maker made a mistake.
Somewhere along the way I dragged out the digital logic scope to compare the output of a good project to the one with the transmission errors. Here is what a good carriage return and linefeed serial sequence looks like:
Valid CR LF logic trace.
Here are three examples of bad carriage return and linefeed serial sequences:
Corrupt CR LF Logic Traces
I noticed the following about the corrupt serial sequences:
What could cause the ATmega168’s built-in serial hardware (USART - Universal Synchronous and Asynchronous serial Receiver and Transmitter) to pause in the midst of a bit and then continue as though nothing had happened? Let’s look at the clock source for the USART. In my case, it is the system clock. What can pause the system clock?
Well, it turns out that the noise reduction mode of the analog-to-digital converter (ADC) pauses many of the chip’s clocks, including the one that feeds the serial hardware. I had a timer that would occasionally fire to measure the voltage of a potentiometer. If the measurement occurred while serial data was being transmitted, the ADC conversion paused the clock used by the USART, thus stretching the current bit. Because asynchronous serial communication relies on exact timing, the stretched bit corrupted that character and any characters that immediately followed.
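To see why a single stretched bit is enough to mangle a character, here is a small host-side simulation in C. It is only a sketch under simplifying assumptions: one 8N1 frame, a receiver that samples once in the middle of each bit (real USART receivers oversample), and a transmitter whose clock freezes for a given number of ticks. The names `TICKS_PER_BIT`, `tx_level`, and `rx_byte` are made up for this illustration and are not part of the project’s code.

```c
#include <stdint.h>

/* Simulation resolution: ticks per bit period (an arbitrary choice). */
#define TICKS_PER_BIT 16

/*
 * Level of the TX line at receiver time `tick` for one 8N1 frame carrying
 * `byte`. The transmitter's clock is frozen for `pause_ticks` ticks starting
 * at `pause_start`, which stretches whatever bit is on the line at that time.
 */
int tx_level(uint8_t byte, int tick, int pause_start, int pause_ticks)
{
    int tx_tick = tick;                   /* the transmitter's own sense of time */
    if (tick >= pause_start + pause_ticks)
        tx_tick = tick - pause_ticks;     /* clock resumed, but now running late */
    else if (tick >= pause_start)
        tx_tick = pause_start;            /* clock frozen */

    int slot = tx_tick / TICKS_PER_BIT;   /* 0 = start, 1..8 = data, 9+ = stop/idle */
    if (slot == 0)
        return 0;                         /* start bit is low */
    if (slot <= 8)
        return (byte >> (slot - 1)) & 1;  /* data bits go out LSB first */
    return 1;                             /* stop bit and idle line are high */
}

/*
 * What a receiver that assumes perfect timing would decode. It samples the
 * middle of each expected bit slot and flags a framing error if the stop
 * bit is not high.
 */
uint8_t rx_byte(uint8_t sent, int pause_start, int pause_ticks, int *framing_error)
{
    uint8_t received = 0;
    for (int slot = 1; slot <= 8; slot++) {
        int sample = slot * TICKS_PER_BIT + TICKS_PER_BIT / 2;
        if (tx_level(sent, sample, pause_start, pause_ticks))
            received |= (uint8_t)(1u << (slot - 1));
    }
    int stop_sample = 9 * TICKS_PER_BIT + TICKS_PER_BIT / 2;
    *framing_error = (tx_level(sent, stop_sample, pause_start, pause_ticks) == 0);
    return received;
}
```

In this model, sending a carriage return (0x0D) with a pause one full bit long at the start of the second data bit, `rx_byte(0x0D, 2 * TICKS_PER_BIT, TICKS_PER_BIT, &err)`, decodes as 0x19 with a framing error. A pause of only 4 ticks at the same spot is absorbed, and 0x0D arrives intact.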
The solution is to either wait until all characters have finished transmitting before performing a low-noise analog conversion, or to perform a standard analog conversion. For the purposes of this project, low-noise was unnecessary.
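If you do need low-noise conversions, one way to get the first option without blocking is to have the timer merely record that a measurement is due, and let the main loop start the conversion once the transmitter is idle. The following is a minimal sketch of that idea in C, not the code from this project. The `adc_gate` type and its functions are hypothetical names, and how you determine `tx_busy` (for example, from your transmit buffer together with the USART’s status flags) is left as an assumption for your own hardware layer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tracks whether a potentiometer reading has been requested but not taken. */
typedef struct {
    volatile bool pending;   /* volatile: set from a timer interrupt */
} adc_gate;

/* Call from the timer handler instead of starting the conversion there. */
void adc_gate_request(adc_gate *gate)
{
    gate->pending = true;
}

/*
 * Call from the main loop. Returns true exactly once per request, and only
 * when the caller reports that no serial frame is being transmitted. The
 * caller should then start the low-noise conversion immediately, before
 * queuing any new serial output.
 */
bool adc_gate_may_convert(adc_gate *gate, bool tx_busy)
{
    if (!gate->pending || tx_busy)
        return false;        /* nothing due, or a frame is still on the wire */
    gate->pending = false;
    return true;
}
```

The measurement may be delayed by a few character times, which is harmless for a slowly changing input such as a potentiometer. If the delay matters more than the noise, the standard conversion remains the simpler choice.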
In summary, if your serial transmission is randomly corrupted, and you see a stretched bit on a serial logic trace, then check your code for anything that pauses or sleeps the clock source. You might want to do that before you resolder every joint, swap chips, and so on.
I hope that this article provides you with insights into the most common form of device-to-device communication: asynchronous serial. Also, I hope you find the crystal speed vs baud rate table helpful. Lastly, consider using the timer input capture hardware if your hardware has it. It is surprisingly accurate and less work than polling a pin.