Sunday, November 12, 2017

Composite (NTSC) Video on mbed Nucleo (stm32f401)



This weekend, I put together a demo program for an STM32F401 development board that generates a composite video output.  My development board didn't have a DAC, and I needed three different output levels, so I used two digital output pins and resistors.  This solution isn't incredibly robust - different monitors and displays need slightly different resistor values to work correctly.  I found that a resistor ratio of around 1:2 worked pretty well.  The SYNC pin alone should produce about 0.3V at the composite input, and the SYNC and VID pins together should produce something in the range of 0.7 to 1.0V at the composite input.
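As a sanity check on resistor values, you can predict the level at the composite input (which terminates the line with 75 ohms to ground) with a simple node equation.  Here's a minimal sketch - the 680/330 ohm pair is an assumption that lands near the 1:2 ratio, not necessarily what's on my board:

#include <cstdio>

// Predicted composite level for a two-pin resistor DAC driving a 75 ohm input.
// Assumptions: 3.3V logic, Rs = 680 ohms on SYNC, Rv = 330 ohms on VID.
float level(float vs, float vv, float rs = 680.0f, float rv = 330.0f)
{
    const float rload = 75.0f; // composite inputs are terminated with 75 ohms
    // Node equation at the input: (vs-V)/rs + (vv-V)/rv = V/rload, solved for V
    float g = 1.0f/rs + 1.0f/rv + 1.0f/rload;
    return (vs/rs + vv/rv) / g;
}

int main()
{
    printf("sync tip: %.2f V\n", level(0.0f, 0.0f)); // ~0.00 V (both pins low)
    printf("blanking: %.2f V\n", level(3.3f, 0.0f)); // ~0.27 V (SYNC high)
    printf("white:    %.2f V\n", level(3.3f, 3.3f)); // ~0.83 V (both high)
    return 0;
}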


The goal of this project was to generate NTSC video output using only the mbed libraries.  I'm sure it's possible to do a much better job if the timers and interrupts are configured manually, but that's a lot of work...  To learn about NTSC, I used this PDF (link).  The basic idea is that each horizontal line begins with a synchronization signal, followed by the data for that line.  Each line is around 63 microseconds long, meaning you need better than 1 microsecond timing resolution if you want more than 63 horizontal pixels.  After all the horizontal lines are scanned, including a few bonus ones that never show up on the screen, there is a vertical sync pattern, which starts the entire process over.  The sub-microsecond resolution turned out to be quite an issue - the mbed timing functions are based on 1 microsecond timers, so I needed to get creative.  Also, the mbed "Ticker" class fails to time accurately (around 20 microseconds of jitter) if more than one Ticker is in use, so I could have exactly one accurate source of timing.
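For reference, the textbook NTSC horizontal timing I was working from looks roughly like this (approximate values - exact figures vary a little between references):

// Approximate NTSC horizontal line timing
const float T_FRONT_PORCH_US = 1.5f;  // blanking level before the sync pulse
const float T_HSYNC_US       = 4.7f;  // sync pulse (lowest signal level)
const float T_BACK_PORCH_US  = 4.7f;  // blanking level again, carries the color burst
const float T_ACTIVE_US      = 52.6f; // the visible part of the line
const float T_LINE_US        = 63.5f; // total: front porch + sync + back porch + active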


Here's what a single horizontal line looks like:
The vertical sync pattern is quite complicated:

Getting the v-sync timing reliable on multiple monitors turned out to be incredibly challenging, so I eventually wrote a stupid program which slowly adjusts the waveform, and just watched the screen until it worked.  I found that a much simpler v-sync pattern was sufficient.

Due to the 1-microsecond resolution limit of the default mbed library, I was unable to set up per-pixel timing.  Horizontal line timing used a Ticker running every 63 microseconds.  This is slightly faster than the 63.5 microsecond NTSC standard, but it seems to work.  64 microseconds did not.  The ISR is surprisingly simple:

void isr()
{
//SETUP
    uint8_t nop = 0;  //1: time columns with nops, 0: time with wait_us
    uint8_t* sptr;    //pointer to sync buffer for this line
    uint8_t* vptr;    //pointer to video buffer for this line
    if(l < V_RES)    { vptr = im_line_va + ((l/4)*H_RES); sptr = im_line_s; nop = 1; } //visible line (each image row spans 4 scan lines)
    else if(l < 254) { vptr = bl_line_v; sptr = bl_line_s; nop = 0; } //blank off-screen line
    else             { vptr = vb_line_v; sptr = vb_line_s; nop = 1; } //vertical sync line
    uint8_t lmax = nop ? H_RES : 12; //number of columns
//DISPLAY
    for(uint8_t i = 0; i < lmax; i++) //loop over each column
    {
        vout = vptr[i]; //set output pins
        sout = sptr[i];
        if(nop) //nop delay: precise, but hogs the CPU
        {
            asm("nop");asm("nop");asm("nop");asm("nop");asm("nop");asm("nop");asm("nop");
        }
        else //wait_us delay: sloppy, but lets background code run
        {
            wait_us(1);
            if(i > 2) i++; //wait_us overshoots, so skip columns to stay on schedule
        }
    }
    //move to next line
    l++;
    if(l > 255) l = 0;
}
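The ISR is driven by a single mbed Ticker.  For context, a minimal sketch of the setup around it might look like this (the pin choices and names here are assumptions, not the repo's actual code):

#include "mbed.h"

DigitalOut vout(PA_8);   // hypothetical VID pin
DigitalOut sout(PA_9);   // hypothetical SYNC pin
Ticker line_ticker;
volatile uint16_t l = 0; // current scan line, advanced by the ISR

void isr();              // the line-drawing ISR shown above

int main()
{
    line_ticker.attach_us(&isr, 63); // one scan line every 63 us
    while(1)
    {
        // background rendering into the image buffer happens here
    }
}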


The ISR gets slightly more complicated because it uses two different timing strategies:

  1. "nop": A number of "nop" instructions are run, delaying for an exact number of CPU cycles.  This is very accurate, but hogs the CPU and prevents other contexts from running.
  2. "wait_us":  This is low resolution (can only do multiples of 1 us, which is 1/60 of a horizontal scan), and low accuracy (sometimes waits too long).  With only this method, I managed to get 18 horizontal pixels.  However, it allows other tasks to run in the background while it is waiting - a modern microcontroller can do a ton in 1 microsecond.


There are three cases for setup:

  1. The line number is less than the vertical resolution.  In this case, point vptr at the current line of the image buffer, point sptr at the horizontal sync pattern, and choose the "nop" timing.  More on this later.
  2. The line number is greater than the vertical resolution, but less than 254.  In this case, prepare to display the blank patterns for video and sync, use "wait_us" timing instead of "nop", and use a horizontal resolution of 12.
  3. The last line: prepare for vertical sync.

In the display section, the video data loaded into vptr and the sync data loaded into sptr are written to the VID and SYNC pins.  When important data (vertical sync and the actual video) is being sent, timing is done with a series of "nop" instructions.  For the less important lines (the "blank" bonus lines that never show up on the screen), timing is done with wait_us(1) calls.  These wait_us(1) calls are very important - with nop timing, the ISR takes around 62 microseconds to execute, leaving almost no time for other processing.  During a wait_us(1) call, the microcontroller is free to switch contexts and execute other code.  The wait_us function is terrible, and occasionally waits 2 or even 3 microseconds, so the horizontal resolution has to be reduced to 7.  This low resolution would look terrible when displaying video, which is why this trick is only used when the stuff being drawn is off screen.


In the background, the microcontroller is busy updating the image buffer to display other items.  I implemented code to translate and rotate 3D points, draw lines between points, draw checkerboards, draw simple bitmapped images, and calculate the position of a very simple bouncing ball (inspired by this).
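As an illustration of the kind of math running in the background, here's a sketch of rotating a point about the Z axis and projecting it into buffer coordinates.  The names, the fixed resolution, and the simple perspective divide are my assumptions, not the code from the repo:

#include <math.h>

enum { H_RES = 128, V_RES = 96 }; // assumed resolution, not the demo's actual numbers

struct Vec3 { float x, y, z; };

// Rotate p about the Z axis by theta radians.
Vec3 rotate_z(Vec3 p, float theta)
{
    float c = cosf(theta), s = sinf(theta);
    Vec3 r = { c*p.x - s*p.y, s*p.x + c*p.y, p.z };
    return r;
}

// Project into (column, row) with a simple perspective divide; dist is the
// distance from the camera to the origin.
void project(Vec3 p, float dist, int* col, int* row)
{
    float scale = dist / (dist + p.z); // farther points shrink toward the center
    *col = (int)(H_RES/2 + scale * p.x);
    *row = (int)(V_RES/2 + scale * p.y);
}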

Sadly, none of the rest of the code can use any sort of timing, because the mbed Ticker class becomes too inaccurate for stable video output when multiple Tickers are running.  Instead, the demo code is hand-tuned to be more or less efficient so that it runs at a reasonable speed.  This involved all sorts of terrible modifications: compile, load, test, readjust.  I don't know what compiler the online IDE uses, other than that it isn't gcc (or at least the error messages don't match gcc's), and I suspect it compiles with no optimization flags, so some code includes optimizations I'd normally rely on the compiler to do, and some code is intentionally pessimized to run more slowly.  For whatever reason, the code to generate the checkerboard was incredibly slow compared to everything else, including the floating point math used to rotate the cube and draw lines.  In the end, the version that was fast enough, but not too fast, looked like this:


void draw_v_check(int8_t r, uint8_t tt)
{
    for(int i = 0; i < H_RES; i++)
        for(int j = 0; j < V_RES; j++)
            im_line_va[i + j*H_RES] = ((i > 20) && (i < 98)) &&
                (tt ^ (((j % (r*2)) >= r) ^ ((i % (r*2)) >= r)));
}

In the end, the silly demo looks like this:


And the full code, including a very poorly-written demo, is here:
https://github.com/dicarlo236/mbed-video/blob/master/main.cpp


Lesson learned: Don't use the mbed libraries for things that require complex timing!  This code is a disaster.

Thursday, September 21, 2017

Playing MIDI on an STM32F446 Part 1: Square Waves

I've decided to try to make an 8-bit style music player out of a microcontroller.  There are some really cool Arduino versions of this project, but they all seem to require a very confusing custom file format to describe the song.  My goal is to use a simple format that can be easily generated from a MIDI file.  As a proof of concept, I wrote some code which plays a MIDI file converted to a list of note data in the format "start time, duration, pitch, volume".
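A sketch of what one entry in that note list might look like (the field names and types here are my guesses, not the exact format from the repo):

#include <stdint.h>
#include <math.h>

// One note event, pre-converted from MIDI: start time, duration, pitch, volume.
struct NoteEvent {
    uint32_t start_ms;  // when the note begins
    uint32_t length_ms; // how long it lasts
    uint8_t  pitch;     // MIDI note number (69 = A4 = 440 Hz)
    uint8_t  volume;    // amplitude, 0-255
};

// Standard MIDI note-number-to-frequency conversion: f = 440 * 2^((n-69)/12).
float note_to_hz(uint8_t n)
{
    return 440.0f * powf(2.0f, (n - 69) / 12.0f);
}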

This version has a known issue with high pitch notes. If a note requires a duty cycle of 80.5 DAC cycles, it will just round down to 80 cycles, which makes it sound flat. Instead, it should alternate between 80 and 81.
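A standard fix (sketched below - this isn't in the player yet) is to carry the rounding error forward Bresenham-style, so the period dithers between 80 and 81 cycles and averages out to 80.5:

#include <stdint.h>

// Dither a fractional period (e.g. 80.5 DAC cycles) by carrying the
// remainder forward. Called once per square wave half-cycle.
uint32_t next_half_period(uint32_t whole, uint32_t num, uint32_t den)
{
    static uint32_t acc = 0;  // running fractional error, in units of 1/den
    acc += num;
    if(acc >= den) {
        acc -= den;
        return whole + 1;     // take the longer period this time
    }
    return whole;             // take the shorter period
}

// Example: 80.5 cycles -> next_half_period(80, 1, 2) returns 80, 81, 80, 81...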

Eventually, I plan to add more sounds and effects other than "square wave with 50% duty cycle", but for now, that's the only choice. The plan is to make each instrument kind of like a script. Each instrument would have an array of function pointers and arguments which get executed in order. Functions could do things like "delay 20 instrument cycles", "set output to triangle wave", "increase pitch by major third", "do a vibrato effect", or even "move function pointer array pointer back n steps".
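Here's a sketch of what that could look like (everything here is hypothetical - none of it is implemented yet):

// Hypothetical instrument "script": an array of opcodes with arguments,
// executed in order by the synth, one step per instrument tick.
enum Op { OP_DELAY, OP_WAVE_SQUARE, OP_WAVE_TRIANGLE, OP_PITCH_OFFSET, OP_VIBRATO, OP_JUMP_BACK };

struct Step {
    Op      op;
    int16_t arg; // cycles to delay, semitones to shift, steps to jump back...
};

// "play a square wave, wait, shift up a major third, add vibrato, repeat"
const Step demo_instrument[] = {
    { OP_WAVE_SQUARE,  0 },
    { OP_DELAY,       20 },
    { OP_PITCH_OFFSET, 4 }, // major third = 4 semitones
    { OP_VIBRATO,      2 },
    { OP_JUMP_BACK,    3 },
};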

I plugged the DAC into my computer's line in port, and recorded this:
https://www.youtube.com/watch?v=bGQ_Zj2jo8s

Saturday, July 22, 2017

Hoverboard Robot Arm Part 1: The Class Project

I decided to build a robot arm for my 6.115 final project, which, in retrospect, was a bad idea.  The project is open ended, but requires you to use an Intel 8051 and a Cypress PSoC microcontroller.  The 8051 is from the 1980s and is useless, and the PSoC is only there because Cypress gave MIT a ton of money.  The PSoC is honestly a very bad microcontroller - it's neither high performance nor cheap, but it's still way better than the 8051.  Its unique feature is a vaguely FPGA-like block that allows certain pins to be remapped.  As a result it does many things, but nothing well.  It is very slow at floating point math.  The analog inputs are slow.  It has literally half the PWM channels and encoder decoders of a similarly priced ST micro.  It has no hardware serial support.  Cypress donated development kits, so the class uses PSoC microcontrollers as the example of modern microcontrollers.

At least in my opinion, most project-based classes involve projects that are on the boring side and are technically not that interesting.  6.115 is the exact opposite - the final projects are supposed to be very ambitious.  My original final project proposal included the following:

  1. Designing a 3-phase inverter capable of running at 400V, 20A with low side current sensing
  2. Laying out a PCB for the motor controller
  3. Populating/debugging the PCB
  4. Implementing field oriented motor control for torque control of a brushless motor
  5. Designing a discrete-time current controller
  6. Characterizing a motor from an electric hoverboard
  7. Designing a full state feedback controller for robot arm endpoint position control
  8. Implementing code to compute the robot arm's Jacobian to have force control of the robot arm
  9. Designing and fabricating an extremely low inertia, direct drive robot arm
  10. Simulating the robot arm force/torque control in MATLAB
  11. Writing a motor simulator to test the field oriented control logic
  12. Coming up with a robust communication protocol for the motor controller to talk to the 8051
  13. Hooking the 8051 to the timer chip, the serial chip (to talk to the motor controller), two ADC's (for controlling the robot arm), a keypad encoder (for changing settings), and one of those Hitachi displays (for showing position error)
which pretty much hit everything covered in the class (motors, current control, first order circuits, feedback control, switching regulators, feedback control of switching regulators, stepper motors, digital signal processing, digital filters, and characterizing systems).  The suggestion was to add code that takes in a JPEG, vectorizes it, then draws the image.  On an Intel 8051.  


This project was submitted as a "safe" robot arm with "small" electric motors, driven by "little 3 phase inverter ICs" powered from a "low-power" bench supply. There's a huge number of safety requirements for the project, which is pretty reasonable given some of the proposed ideas. The "safe" and "small" motors are around 1.5 kW motors from an electric hoverboard, the "little" IC is actually a 75A, 600V IGBT in a package that can dissipate 225W, and the "low-power bench supply" is, well, actually a completely reasonable 3A, 15V bench supply that we're required to use.

We were taught in class that words like "low-power" or "large" are relative and meaningless in engineering, so I didn't feel too bad about submitting the robot arm.  For the project checkoff, I ran the arm with a power supply limited to ~5 watts, and had somebody inspect the arm to make sure there wasn't a hidden boost converter with dangerously high voltage, so there was no actual danger.

Building the Arm

Bayley recently bought 15 hoverboards from a man in New York, and gave me a few motors to use for this project.  I started by removing the tires from the motors and disassembling them.

There's a lot of motor inside!  Unfortunately, I couldn't come up with a clever way to hide an encoder on the inside of the motor, so I bored a hole in the casing to give me access to the fixed shaft from the front of the motor.  There aren't very many circular features on this motor, so finding the center is challenging.  I ended up putting it in a four-jaw chuck on the lathe, but was only able to get within .001" because nothing on this thing is actually round.  

During the next week, I made all the brackets and clamps.  Some were made on the MITERS CNC
and others were made on the Haas VF-2 next door

The last three parts were done on all-manual equipment because the CNCs were in use

After a week of machining, I had a bunch of shiny aluminum parts

Next, I found some thin-wall tube lying around MITERS and assembled the two arms


Electronics

The controller for this arm is overkill.  The IGBT module (FNA27560) is good for 50A, 400V continuous with good heatsinking, so it's more of an electric vehicle controller than a robotics controller.  I created a schematic inspired by the reference design for the IGBT module and added a microcontroller and differential serial.  There are no power supplies on the board.  I'm not sure if this was a good decision or not, but I planned to put the 5V, 3.3V, and 15V supplies on a separate board that connected to both boards.  Unfortunately, I wasn't allowed to use the ST microcontroller for the class project, as the class has received generous donations of money and parts from Cypress.  Instead, I decided to take advantage of the Cypress Programmable System on a Chip 5LP microcontroller development platform, and created a controller that was easy to implement with the Cypress "PSoC Creator" software (a 3GB IDE that vaguely resembles Office 2003).

The board I drew up was a little bit scary - there's not much ground plane and the micro is directly under the IGBT. The reference design seemed incredibly conservative and only put components on the top layer, so I moved things around a bit to get a tighter layout.  
"Small" Power ICs

Around 6 days after placing the order with 3-PCB, the boards arrived.  I somehow screwed up the gerber file generation, so the boards are missing mounting holes, heatsink holes, and my text.  Populating the board was uneventful.  I screwed up and bought a through-hole diode instead of the correct surface-mount one (not sure how I managed that), and the holes for the large IGBT module pins were slightly too small, requiring the pins to be filed down.

Fiddling with PSOC Creator to get center-aligned PWM with deadtime took an hour or so, which isn't too bad.  I ended up with my board in a state where the clock is set "too fast" for reliable programming, so it fails to erase flash one time in four, causing PSOC Creator to throw endless memory-out-of-bounds errors until you restart it.  It's not a well-written application.

The trick to aligning the PWMs is to drive the counter resets from a control register.  When the control register is toggled, all three counters zero at exactly the same time, aligning the PWMs.
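In code, that looks something like the sketch below.  The API prefixes (PWM_A, SYNC_CTL) come from whatever the components are named in the schematic editor - these names are assumptions, not the ones from my project:

// Start all three PWM blocks, then pulse the control register wired to
// their reset inputs so the counters leave zero on the same clock edge.
PWM_A_Start();
PWM_B_Start();
PWM_C_Start();

SYNC_CTL_Write(1); // hold all three counters in reset
SYNC_CTL_Write(0); // release them together: PWMs are now aligned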

Once I had control of the inverter with the PSOC, I fed open loop sinusoidal phase voltages into the motor:

Next, I got the DAC working.  The PSOC has no easy-to-use print statements for debugging (piece of junk), so the DAC is the main debugging output.  Adding a debug serial port would mean providing my own USB-to-serial adapter and burning a ton of CPU on software serial, which prevented the FOC loop from running fast enough.  Once I configured the quadrature encoder "block", I wrote a little automatic calibration routine.  The motor spins open loop until it hits an index pulse.  To figure out the electrical offset, the routine raises one phase high and leaves the other two low, causing the rotor to lock to the d axis.  From there, the microcontroller calculates the electrical offset and applies q-axis volts, causing the motor to speed up.  Here's a video (don't forget your safety tupperware when doing class projects):
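Sketched out, the calibration looks something like this (set_phase_duty, wait_ms, and encoder_count are hypothetical helpers, not the project's real names):

// Find the encoder count that corresponds to electrical angle zero.
int32_t calibrate_electrical_offset(void)
{
    // Lock the rotor to the d axis: one phase high, the other two low.
    set_phase_duty(0.15f, 0.0f, 0.0f); // small fixed duty so nothing overheats
    wait_ms(500);                      // let the rotor settle

    // The current count now corresponds to electrical angle zero.
    return encoder_count();
}

// Afterwards: theta_elec = (encoder_count() - offset) * POLE_PAIRS * 2*pi / CPR,
// and applying volts on the q axis (90 electrical degrees ahead) produces torque.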


I wrote some other stuff with some nice pictures, but Google Blogger is terrible and messed up the formatting, so I got frustrated and copied everything into a video.  Watch as I slowly lose interest in the project and things get crappier and crappier.  The robot arm gets clamped to an old cart and the electronics get really bad.




In the end, after failing at least 3 different safety inspections, I finally added a bunch of clamps and polycarb shields, hid it in the corner of the lab underneath unused benches, and passed the final safety inspection the day before checkoff.  I demoed it slowly moving in a circle to a TA for 3 seconds and got full points for my project demo.

In the future, I do not plan to come back to this project.




Monday, May 22, 2017

Parrot Drone Reverse Engineering to View Camera Stream, Edit Settings, and Flash LED

My friend had a camera board out of some old Parrot quadcopter.  When powered on, it creates a wi-fi network that you connect to with your smartphone.  The user then downloads an app to control the quadcopter and view the video.  As a fun project, I decided to find out how easy it would be to view the camera feed without the app, and to see how much of the system was open.  It turned out to be really easy - there's no security.

We powered on the board from a 12V power supply and waited for it to boot.  It created a wi-fi network with a visible SSID.  I connected to the network with my laptop running Ubuntu MATE 14.04 with DHCP enabled and was assigned an IP address of 192.168.1.3.

The IP address of the camera board was found with
arp -a
which was 192.168.1.1 for mine.

To see which ports were open, I ran 
nmap 192.168.1.1
which revealed three open ports: ftp, telnet, and port 5555, which nmap (incorrectly) identified as some sort of multiplayer Linux game.

The ftp server accepted an anonymous connection, but put me in an empty directory.  I'm assuming it's for sending firmware updates, so I didn't try much else from here.

Next, I investigated the mysterious port 5555.  I first pointed Windows Media Player at it, and nothing happened.  Same with VLC.  Running
ffplay tcp://192.168.1.1:5555/
gave me a ~25 fps, 320x240 video stream with around 5 seconds of latency from the big camera.  You'll have to add a PPA for ffmpeg and install it if you're on 14.04 - Ubuntu was dumb and switched to libav, which I still haven't gotten around to learning about.

Finally, I connected with telnet and got a BusyBox shell.  The board has a fairly complete basic Linux install, with things like grep and vi.  You can toggle the red/green LED just by running `echo 1 > /.....path_to_gpio....`.  The root user has no password, so you have full access to the system.  Normally that isn't a big deal - ssh doesn't let users without passwords connect - but they set up telnet, which does.  From a security standpoint, passwordless telnet is really dumb.  If somebody malicious gets root access, you're screwed: they can delete everything on the device, retune your gains, or even mess with the device's power management settings, which stands a good chance of damaging the hardware.  And if a remote user executed rm -rf /, the device would be unrecoverable for most consumers.