Linux Keylogger: How to Read the Linux Keyboard Buffer
Have you ever wanted to build a Linux keylogger? It turns out quite a few people have at one point or another. A quick GitHub search shows a bunch of results written in multiple languages. In my own journey to create one, I decided to document the main ideas so that anyone else could use this as a bit of a reference.
How do I find the keyboard buffer file?
All the attributes of input devices for your Linux computer can be found with the following:
cat /proc/bus/input/devices
For a full explanation you can reference this Unix Stack Exchange answer, but what you will be looking for is EV=120013. EV is a bitmask of the event types a device supports, and 120013 is the combination a typical keyboard reports. When you find that value, we're going to need the eventx value from Handlers=eventx. You can use this command to grab the device for you:
grep -E 'Handlers|EV=' /proc/bus/input/devices | \
grep -B1 'EV=120013' | \
grep -Eo 'event[0-9]+'
This command will output a single word; mine says event2, for example. The keyboard file will then be located at /dev/input/event2, so we can see if we're correct by running:
sudo cat /dev/input/event2
Once you run that, start typing keys to see if the output responds. It will all be unreadable garbage on your screen, but that's ok: we're at least on the right track.
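The grep pipeline above can also be sketched in Python. This is only an illustration: the function name `find_keyboard_events` and the `SAMPLE` text are made up for the example, and the sample is a trimmed imitation of the `/proc/bus/input/devices` layout rather than real output from my machine.

```python
import re

KEYBOARD_EV = "120013"  # the EV value the article looks for


def find_keyboard_events(devices_text):
    """Return the eventN handler of every device block whose EV line is 120013."""
    matches = []
    # Device blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", devices_text.strip()):
        ev = re.search(r"^B: EV=(\w+)", block, re.MULTILINE)
        handler = re.search(r"^H: Handlers=.*?\b(event\d+)", block, re.MULTILINE)
        if ev and handler and ev.group(1) == KEYBOARD_EV:
            matches.append(handler.group(1))
    return matches


# Hypothetical sample data in the same layout as /proc/bus/input/devices.
SAMPLE = """\
I: Bus=0019 Vendor=0000 Product=0001 Version=0000
N: Name="Power Button"
H: Handlers=kbd event0
B: EV=3

I: Bus=0003 Vendor=046d Product=c31c Version=0110
N: Name="Example USB Keyboard"
H: Handlers=sysrq kbd event2 leds
B: EV=120013
"""

print(find_keyboard_events(SAMPLE))
```

On a real system you would pass in the contents of `/proc/bus/input/devices` instead of `SAMPLE`. Note that it returns a list, since a machine can have more than one matching device.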
How do I read this gobbledygook?
The reason that the screen appears to be unreadable is because we're reading binary data, like the 1's and 0's you hear people talk about. The problem with reading raw binary is you can't decipher it like you can a text file, so we're going to use xxd to help us understand things. Let's try to read the data for pressing the letter a on our keyboard. It's going to be a bit tricky, though, because extra keypresses will generate more output, so here's how I accomplished it:
- In a terminal run sleep 2 && sudo cat /dev/input/event2 | xxd
- Within 2 seconds move focus to any other window so when you type the letter a it won't appear in the terminal
- ONLY PRESS a and memorize the address on the side (it should end at 00000080), because as soon as you touch the keyboard again it will pollute the output.
The following is a sample of the output from when I pressed a:
00000000: 696c 645d 0000 0000 bbd7 0700 0000 0000 ild]............
00000010: 0400 0400 0400 0700 696c 645d 0000 0000 ........ild]....
00000020: bbd7 0700 0000 0000 0100 1e00 0100 0000 ................
00000030: 696c 645d 0000 0000 bbd7 0700 0000 0000 ild]............
00000040: 0000 0000 0000 0000 696c 645d 0000 0000 ........ild]....
00000050: b0d1 0800 0000 0000 0400 0400 0400 0700 ................
00000060: 696c 645d 0000 0000 b0d1 0800 0000 0000 ild]............
00000070: 0100 1e00 0000 0000 696c 645d 0000 0000 ........ild]....
00000080: b0d1 0800 0000 0000 0000 0000 0000 0000 ................
We now have something to work with. Next we have to decipher the binary data which will take a bit of explanation.
How do I interpret the binary data?
If you don't know how binary and hexadecimal work, I'll try to give a super quick rundown. Each hexadecimal character (0-f) represents 4 bits (a bit is a 1 or 0). Here's the conversion table:
Binary Hex
0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
0110 = 6
0111 = 7
1000 = 8
1001 = 9
1010 = a
1011 = b
1100 = c
1101 = d
1110 = e
1111 = f
A “byte” of data is 8 bits (e.g. 10110011), which means that a “byte” is represented by 2 hexadecimal characters:
1011 0011
   b    3
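If you want to check a conversion like this without the table, here is a small sketch using Python's built-in `format`. The variable names are just for illustration.

```python
byte = 0b10110011  # the example byte from the text

hex_text = format(byte, "02x")  # the byte as two hex characters
bin_text = format(byte, "08b")  # the byte as eight bits

# Split the bits into two 4-bit halves; each half maps to one hex character.
high, low = bin_text[:4], bin_text[4:]
print(high, low)
print(format(int(high, 2), "x"), format(int(low, 2), "x"))
print(hex_text)
```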
Hexadecimal makes it easier to display larger volumes of binary data and is also less confusing to read than a screen full of 1's and 0's. If you run xxd -b, it will show the 1's and 0's instead of hex. Let's take the first line of the keyboard output and try to break it down:
00000000: 696c 645d 0000 0000 bbd7 0700 0000 0000 ild]............
00000000
This is the address in hexadecimal.
696c 645d 0000 0000 bbd7 0700 0000 0000
1 2  3 4  5 6  7 8  9 10 1112 1314 1516
These are the 16 bytes of data starting at address 00000000.
ild]............
This is the ascii representation of the binary (the garbage printed to the screen earlier). Here's an ascii table or reference. The table shows that ascii has 128 values, but a byte can represent a total of 256 values. Bytes above 7f, along with non-printable control characters, just show up as a dot (.). Let's take the first 4 bytes of our binary data and look them up on the ascii table:
69 = 'i'
6c = 'l'
64 = 'd'
5d = ']'
Everything looks hunky dory. Looking at bytes 9-11:
bb is > 7f so it shows as '.'
d7 is > 7f so it shows as '.'
07 = 'BEL', a non-printable control character, so it also shows as '.'
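The lookup above can be mimicked in a few lines. This is a rough sketch of how the right-hand column of xxd is produced, not xxd's actual code, and `ascii_column` is a name I made up for the example.

```python
def ascii_column(data):
    """Render bytes roughly the way the right-hand column of xxd does."""
    # Printable ascii runs from 0x20 (space) to 0x7e (~); everything else becomes a dot.
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)


# The first 16 bytes of the keyboard output shown earlier.
first_line = bytes.fromhex("696c 645d 0000 0000 bbd7 0700 0000 0000")
print(ascii_column(first_line))
```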
So hopefully I haven't confused you too much and you've got a basic feel for reading binary data. Binary data means nothing unless an encoding is applied to it; ascii is just one way to decode data. To see how Linux stores key presses as binary data we need to reference the input_event struct inside the input.h file. If you aren't familiar with C preprocessor defines, the struct probably looks like this once evaluated:
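struct input_event {
    __kernel_ulong_t __sec;   /* timestamp: seconds */
    __kernel_ulong_t __usec;  /* timestamp: microseconds */
    __u16 type;               /* event type, e.g. EV_KEY */
    __u16 code;               /* which key */
    __s32 value;              /* pressed or released */
};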
__kernel_ulong_t is 64 bits (on a 64-bit system), __u16 is, you guessed it, 16 bits, and I'm sure you can figure out how many bits are in a __s32. All of this data is stored sequentially in memory, but let's break down what it stores.
type: represents an event type, as defined in input-event-codes.h (which you can see is included in input.h). We will want to look for EV_KEY for a key press, which is 01.
code: represents the key that was pressed, again defined in input-event-codes.h. We are pressing a, and KEY_A is defined as 30 (in decimal), which is 1e in hexadecimal. NOTE that 1e is not the same as the ascii representation of a, which is 61 (ascii table reference).
value: represents whether the key was pressed or released. This is pretty simple, as 1 means pressed and 0 means released (a key that is held down also produces autorepeat events with value 2).
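To make the field layout concrete, here is a minimal sketch that unpacks one record with Python's `struct` module. It assumes a 64-bit little-endian machine like mine (byte order is covered further down in this article), so the format string `<QQHHi` is an assumption and not something that will hold on every system. The bytes in `raw` are the key press record from the xxd output above.

```python
import struct

# Assumed layout: u64 seconds, u64 microseconds, u16 type, u16 code, s32 value.
# "<" means little endian with no padding between fields.
EVENT_FORMAT = "<QQHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)  # 8 + 8 + 2 + 2 + 4 bytes

EV_KEY = 0x01  # from input-event-codes.h
KEY_A = 30     # from input-event-codes.h

# One record copied from the dump: seconds, useconds, type, code, value.
raw = bytes.fromhex("696c645d00000000" "bbd7070000000000" "0100" "1e00" "01000000")

sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
print(EVENT_SIZE, sec, usec, ev_type, code, value)
print(ev_type == EV_KEY, code == KEY_A)
```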
So let's build out an example of the data that we're looking for in the stream for a keypress event:
                                             Code
                                        Type  |
|----- seconds ---| |---- useconds ---|  |    |   |-Value-|
0000 0000 0000 0000 0000 0000 0000 0000 0001 001e 0000 0001
      64 bits             64 bits       16b  16b   32 bits
Skipping the seconds and useconds, let's look at some of the output I saved from before and scan for the bytes representing (type, code, value):
00000010: 0400 0400 0400 0700 696c 645d 0000 0000 ........ild]....
00000020: bbd7 0700 0000 0000 0100 1e00 0100 0000 ................
                              ^ this looks similar to our example bytes
But wait? The data is kinda backwards? What gives?
0001 001e 0000 0001 <- Our "constructed" data
0100 1e00 0100 0000 <- Data from the buffer
Maybe your machine doesn't show the data backwards; that means you have a Big Endian system! If yours is backwards like mine, you have a Little Endian system. How does this work? Big endian stores numbers left to right, with the most significant byte (the big end) at the lowest address. Little endian stores the least significant byte (the little end) at the lowest address. Let's look at a couple of examples, with the byte order numbered underneath:
16 bit big endian
0123
1 2
16 bit little endian
2301
2 1
32 bit big endian
0123 4567
1 2  3 4
32 bit little endian
6745 2301
4 3  2 1
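You can reproduce these byte orders yourself with Python's `int.to_bytes` and `int.from_bytes`. This is just a quick sketch to play with; the variable names are mine.

```python
import sys

value16 = 0x0123
value32 = 0x01234567

# The same number written out in each byte order.
big16 = value16.to_bytes(2, "big").hex()
little16 = value16.to_bytes(2, "little").hex()
big32 = value32.to_bytes(4, "big").hex()
little32 = value32.to_bytes(4, "little").hex()
print(big16, little16, big32, little32)

# Going the other way: the two "code" bytes from the keyboard buffer.
key_code = int.from_bytes(bytes.fromhex("1e00"), "little")
print(key_code)

# The byte order of the machine running this script.
print(sys.byteorder)
```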
So our example bytes of:
0001 001e 0000 0001
1 2  1 2  1 2  3 4
get stored as:
0100 1e00 0100 0000
2 1  2 1  4 3  2 1
Which matches the line from our keyboard buffer, SUCCESS! How about the seconds? Linux stores the seconds as a Unix timestamp, and we know that it takes 64 bits, so looking at the output we can see where the seconds are stored:
                              |-- Starting here
00000010: 0400 0400 0400 0700 696c 645d 0000 0000 ........ild]....
00000020: bbd7 0700 0000 0000 0100 1e00 0100 0000 ................
696c 645d 0000 0000
It's little endian, so the actual value is:
696c 645d 0000 0000
1 2  3 4  5 6  7 8
0000 0000 5d64 6c69
8 7  6 5  4 3  2 1
And now we can convert 5d64 6c69 into a decimal number with this calculator, which gives 1566862441. You can take that and use a timestamp converter to get: 08/26/2019 @ 11:34pm (UTC). Pretty cool, huh? I didn't go deep enough to figure out the rest of the binary output, but most of the code repos that I was looking at while researching this just skipped that data. Hopefully you had some fun getting some insight into how keyboards work in Linux.
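Putting the pieces together, here is a rough sketch of a reader that walks the stream record by record, converts the timestamp, and keeps only the EV_KEY records. It assumes the same 64-bit little-endian layout as my machine, and `read_key_events`, `STATES` and `DUMP` are names invented for this example. To keep it runnable anywhere, it reads the xxd sample from earlier out of memory instead of opening the device.

```python
import io
import struct
from datetime import datetime, timezone

EVENT_FORMAT = "<QQHHi"  # assumed: 64-bit little-endian input_event
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 0x01
STATES = {0: "released", 1: "pressed", 2: "repeated"}


def read_key_events(stream):
    """Yield (utc_time, key_code, state) for every EV_KEY record in a binary stream."""
    while True:
        raw = stream.read(EVENT_SIZE)
        if len(raw) < EVENT_SIZE:  # end of stream or a truncated record
            return
        sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
        if ev_type != EV_KEY:
            continue  # skip the records this article does not decode
        when = datetime.fromtimestamp(sec, tz=timezone.utc).replace(microsecond=usec)
        yield when, code, STATES.get(value, str(value))


# The bytes from the xxd sample above, one string per xxd line.
DUMP = bytes.fromhex(
    "696c 645d 0000 0000 bbd7 0700 0000 0000"
    "0400 0400 0400 0700 696c 645d 0000 0000"
    "bbd7 0700 0000 0000 0100 1e00 0100 0000"
    "696c 645d 0000 0000 bbd7 0700 0000 0000"
    "0000 0000 0000 0000 696c 645d 0000 0000"
    "b0d1 0800 0000 0000 0400 0400 0400 0700"
    "696c 645d 0000 0000 b0d1 0800 0000 0000"
    "0100 1e00 0000 0000 696c 645d 0000 0000"
    "b0d1 0800 0000 0000 0000 0000 0000 0000"
)

for when, code, state in read_key_events(io.BytesIO(DUMP)):
    print(when.isoformat(), code, state)
```

To try it against the real keyboard you would pass in something like `open("/dev/input/event2", "rb")` (run with sudo, and with your own event number) in place of the `io.BytesIO` object.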