Helpful Unix Tools: ssh, htop, pstree, strace

The Problem

While working on my color generator project (which I have now dubbed ColorPi), I ran into an interesting issue.

I’m working on the part of my project where my Raspberry Pi lights up an RGB LED in whatever color my API has generated. My Pi is headless (meaning it’s not hooked up to a display), so doing dev work on the Pi itself is kind of annoying because I’m restricted to vim or nano, and y’all, they’re just not my thing right now. So instead, I’m coding on my laptop and using git to keep my Pi up-to-date.

Because the Pi is headless, everything I’m doing is via SSH. It’s so cool: I can communicate everything to the Pi directly from my laptop. The convenience factor of only having to deal with one keyboard and one screen instead of hopping back and forth between devices is immense, and it’s also really improved my command line skills.

However, sometimes I’ll lose the SSH connection because of something simple like accidentally closing my terminal window, or restarting my laptop, and when I SSH back into the Pi, I’ll no longer have access to the logs from the script I’m running – even though the process is still going!

Some Inelegant Solutions

The first time this happened, I rebooted the Pi and, after it started back up and I SSHed back into it, restarted my script. This option works, but it’s far from ideal. The next time it happened, I thought I’d figure out how to find the specific process that was associated with my script, and then terminate and restart it. For this, a tool like htop is very useful. There are different ways to see your running processes: ps aux will do it, but htop gives you such a nice interactive interface!
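If you’d rather script the lookup than eyeball it, here is a minimal sketch of the same idea in Python. It assumes a Linux ps that accepts -eo pid,ppid,args. The function names and the sample listing are made up for illustration (the PIDs are borrowed from my Pi, but the command lines are guesses), so treat it as a starting point rather than the one true way.

```python
import subprocess

# Hypothetical sample of `ps -eo pid,ppid,args` output, for illustration only.
SAMPLE_PS = """\
  PID  PPID COMMAND
 1198  1195 -bash
  596  1198 /bin/bash ./start
  597   596 python deltaListener.py
"""


def read_ps():
    """Return the raw text of `ps -eo pid,ppid,args` (one process per line)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def find_processes(ps_text, needle):
    """Return (pid, ppid, command) tuples whose command line contains needle."""
    matches = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, ppid, then the rest of the line
        if len(parts) < 3:
            continue
        pid, ppid, args = parts
        if needle in args:
            matches.append((int(pid), int(ppid), args))
    return matches


# Search the canned sample; on a real machine, pass read_ps() instead.
for pid, ppid, args in find_processes(SAMPLE_PS, "deltaListener.py"):
    print(pid, ppid, args)
```

On the Pi itself, find_processes(read_ps(), "deltaListener.py") would search the live process list instead of the canned sample.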

There’s something that htop can’t help me with, though. In the above screenshot, there are 3 entries that are running because of my python deltaListener.py command. This can happen when a process spawns child processes or threads (htop lists threads alongside processes by default). Which one should I kill?

To answer this, I learned about pstree, which shows the parent/child relationship between processes. By running pstree -p to get the names and PIDs of my running processes, I get the following output:

systemd(1)-+-alsactl(340)
           |-avahi-daemon(352)---avahi-daemon(406)
           |-bluealsa(634)-+-{bluealsa}(641)
           |               |-{bluealsa}(642)
           |               `-{bluealsa}(643)
           |-bluetoothd(629)
           |-cron(361)
           |-dbus-daemon(336)
           |-dhcpcd(416)
           |-hciattach(628)
           |-login(497)---bash(669)
           |-polkitd(455)-+-{polkitd}(456)
           |              `-{polkitd}(458)
           |-rngd(380)-+-{rngd}(382)
           |           |-{rngd}(383)
           |           `-{rngd}(384)
           |-rsyslogd(341)-+-{rsyslogd}(402)
           |               |-{rsyslogd}(403)
           |               `-{rsyslogd}(404)
           |-sshd(464)-+-sshd(1173)---sshd(1195)---bash(1198)---start(596)---python(597)-+-{python}(601)
           |           |                                                                 `-{python}(610)
           |           |-sshd(14486)---sshd(14508)---bash(14511)---pstree(2686)
           |           `-sshd(30093)---sshd(30344)---bash(30347)
           |-systemd(653)---(sd-pam)(656)
           |-systemd-journal(107)
           |-systemd-logind(342)
           |-systemd-timesyn(291)---{systemd-timesyn}(335)
           |-systemd-udevd(141)
           |-thd(356)
           |-udisksd(363)-+-{udisksd}(428)
           |              |-{udisksd}(439)
           |              |-{udisksd}(493)
           |              `-{udisksd}(507)
           |-wpa_supplicant(338)
           `-wpa_supplicant(469)

This is very cool! At line 20, we can see a process that says start(596). This is the bash script I created to kick off my python program so I didn’t have to keep typing the command line arguments. From there, we can see that the PID of the python process that spawned the others is 597. Now I can run kill -9 597 to stop everything so I can restart!
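The parent/child bookkeeping that pstree does can be sketched in a few lines of Python. This is only an illustration: the sample text imitates what ps -eo pid,ppid,comm might print, filled in by hand with the PIDs from my pstree output, and build_children and descendants are names I made up.

```python
from collections import defaultdict

# Hand-written sample in the shape of `ps -eo pid,ppid,comm` output,
# using the PIDs from the pstree listing. Not captured from a real run.
SAMPLE_PS = """\
  PID  PPID COMMAND
    1     0 systemd
  464     1 sshd
 1173   464 sshd
 1195  1173 sshd
 1198  1195 bash
  596  1198 start
  597   596 python
"""


def build_children(ps_text):
    """Map each parent PID to the list of its direct child PIDs."""
    children = defaultdict(list)
    for line in ps_text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, ppid = int(parts[0]), int(parts[1])
        children[ppid].append(pid)
    return children


def descendants(children, root):
    """Return every PID below root, with each parent listed before its children."""
    found = []
    stack = [root]
    while stack:
        pid = stack.pop()
        for child in children.get(pid, []):
            found.append(child)
            stack.append(child)
    return found


children = build_children(SAMPLE_PS)
print(descendants(children, 1198))  # everything started from that bash shell
```

Asking for the descendants of bash(1198) gives back start(596) and python(597), which matches what the tree shows by eye.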

This is still an inelegant solution to my problem, though. Remember: I don’t really want to stop this process. I would like it to keep running, so I need a way to tap into the output it’s providing.

The Answer, and Many More Questions

At this point, I expected strace to solve my problems for me. This post on Unix Stack Exchange gave me so much hope! To wit: “If all you want to do is spy on the existing process, you can use strace -p1234 -s9999 -e write where 1234 is the process ID.” Sounds perfect, right?

Well, I tested it by first closing my terminal window with the SSH session and then running strace -p597 -s9999 -e write, and got this output:

strace: attach: ptrace(PTRACE_SEIZE, 597): No such process

Imagine my surprise. I returned to htop, and lo and behold, there was indeed no such process. What had happened?
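The “is that PID still alive?” check can also be done from Python before reaching for strace. A common trick is to send signal 0, which delivers nothing but still reports whether the target exists. The helper name pid_exists is my own, and this is a sketch for Linux and other Unix-likes, not a bulletproof implementation.

```python
import os


def pid_exists(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # it exists, but belongs to another user
    return True


# The interpreter running this script certainly exists.
print(pid_exists(os.getpid()))
```

Had I run pid_exists(597) after closing my terminal, it would presumably have returned False, just like strace told me.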

Further inspection of my pstree output shows that python(597) is a result of start(596), which itself is a result of bash and then … sshd. So it looks as though by killing the SSH session, we killed all the processes it was running! The problem I thought I had, the one that sent me down this rabbit hole, turned out to not exist at all. My python script does not keep running if my SSH session dies. Is this a problem? That’s just one of many questions this new bit of information raises.
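To make that chain explicit, here is a small sketch that walks from python(597) up to the top of the tree. The table is copied by hand from my pstree -p output (the parent of systemd is filled in as 0, which is an assumption on my part), and ancestry is just a name I picked.

```python
# pid -> (name, parent pid), transcribed from the pstree -p output.
# The parent of systemd(1) is assumed to be 0.
PROCESSES = {
    1: ("systemd", 0),
    464: ("sshd", 1),
    1173: ("sshd", 464),
    1195: ("sshd", 1173),
    1198: ("bash", 1195),
    596: ("start", 1198),
    597: ("python", 596),
}


def ancestry(processes, pid):
    """Return 'name(pid)' labels from the given process up to the root."""
    chain = []
    while pid in processes:
        name, parent = processes[pid]
        chain.append(f"{name}({pid})")
        pid = parent
    return chain


print(" <- ".join(ancestry(PROCESSES, 597)))
```

Every step between my script and systemd is either my login shell or sshd, so the script really was hanging off the SSH session the whole time.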

Stay tuned.
