Truth, Computing and Fail

Symlinks in a libfs virtual file system: The Pains

anomit | January 7, 2010

The only documentation you have got for libfs is the code itself, which says a lot about the hoops I had to jump through to at least get this thing working :)

I still don’t claim to know all the innards of libfs when it comes to this functionality. It took me around 3 days of poking around the code, plus quite a few kernel freezes and oopses, to figure this thing out. Note also that this is for a non-disk-based file system like /proc. I guess things will be different on disk-based file systems like ext3.

WARNING: This is quite a long-drawn post, as I haven’t presented the solution in a straightforward manner. I have instead chosen to ramble on about my personal experience. For a no-nonsense, tldr-free answer, visit this

To begin with, I was absolutely clueless. All this time I had been using `ln -s` or symlink() without worrying about what really goes on under the hood. Imagine my frustration when ln -s didn’t work on the file system. I had completely neglected this part and thought that it’d be somehow taken care of automagically.

Going through a few online resources, I managed to deduce that the symlink operation takes place at the inode level (whatever you make of that). Looking through the fields of struct inode, I found i_op, a pointer to a struct inode_operations that holds the inode operations. Referring to both the book Understanding the Linux Kernel and the source code for the fields of inode_operations, I found a few function pointers that had something to do with links, namely symlink(), readlink() and follow_link().

Initially, I thought of implementing getattr() so that it’d return a S_IFLNK for the symlink but the idea of having to handle attribute generation for the rest of the dentry objects was too much for my puny brain and this plan was discarded.

Reading through man 2 symlink, I came across this and instantly everything was clear:

Symbolic links are interpreted at run time as if the contents of the link had been substituted into the path being followed to find a file or directory.
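To make that sentence concrete from the calling side, here is a minimal userspace sketch (not part of the libfs module). It creates a link with symlink(), checks with lstat() that the link itself reports S_IFLNK, and reads the stored target back with readlink(). The function name make_and_read_link and its parameters are made up for illustration.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a symlink at `linkpath` pointing to `target`, then read it back.
 * On success the target is stored NUL-terminated in `out` and its length is
 * returned; on any failure -1 is returned. Names are illustrative only.
 */
ssize_t make_and_read_link(const char *target, const char *linkpath,
                           char *out, size_t outlen)
{
    struct stat st;
    ssize_t n;

    if (outlen == 0)
        return -1;
    if (symlink(target, linkpath) != 0)
        return -1;

    /* lstat() inspects the link itself; stat() would follow it. */
    if (lstat(linkpath, &st) != 0 || !S_ISLNK(st.st_mode))
        return -1;

    /* readlink() does not append a terminating NUL, so add one here. */
    n = readlink(linkpath, out, outlen - 1);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```

The target does not have to exist for any of this to work: the link only stores a string, which is what makes "put the target string somewhere and hand it back on request" a plausible plan for a virtual file system.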

First, I changed the file creation function a bit for the symlink so that the inode’s mode is set to S_IFLNK (what userspace later sees in st_mode) and put the target location string in the i_private field of the inode structure.

Then I created an inode_operations structure and put in the following function definitions:

static void *sample_follow_link (struct dentry *dentry, struct nameidata *nd)
{
    nd->depth = 0;
    nd_set_link(nd, (char *)dentry->d_inode->i_private);
    return NULL;
}

static struct inode_operations sample_inode_ops = {
    .readlink = generic_readlink,
    .follow_link = sample_follow_link,
};

....
/* in the function that creates the dentry and inode */
inode->i_op = &sample_inode_ops;

But it was the second step that actually took up most of my time. I read the man page of readlink and naively put in my own readlink implementation.

int sample_readlink(struct dentry *dentry, char __user *buffer, int buflen)
{
    unsigned long ret = copy_to_user(buffer, dentry->d_inode->i_private, buflen);
    if (ret == 0)
        return buflen;
    else
        return -EFAULT;
}

Kernel freezes happened and the wailing of wolves was heard in the distance. Putting in just generic_readlink didn’t solve the problem either, with a kernel oops happening every time ls was run. Looking at the definition of generic_readlink in the kernel source, I found that its comment block mentions that follow_link needs to be implemented for it to work. Looking into the code for ext2 gave me some idea about dealing with struct nameidata. Problem solved :)
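Whatever actually caused the freezes, one thing stands out in the naive sample_readlink: it copies buflen bytes no matter how long the stored target string is, and then reports buflen as the length. Here is a small userspace model of the copy logic readlink(2) describes, as a sketch only; copy_link_target is a made-up name, and in a kernel version memcpy would be replaced by copy_to_user.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of readlink()-style copying (not kernel code).
 * Copies at most `buflen` bytes of `target` into `buffer`, never reads past
 * the end of the string, and returns the number of bytes copied. As with
 * readlink(2), no terminating NUL is written.
 */
int copy_link_target(const char *target, char *buffer, int buflen)
{
    size_t len;

    if (target == NULL || buffer == NULL || buflen < 0)
        return -EINVAL;

    len = strlen(target);
    if (len > (size_t)buflen)
        len = (size_t)buflen;   /* truncate silently, like readlink(2) */

    memcpy(buffer, target, len);
    return (int)len;
}
```

The return value is the number of bytes actually placed in the buffer, which is what callers such as ls rely on to know where the link target ends.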


Kernel module debugging: a simple technique

anomit | November 4, 2009

Disclaimer: I have only just started developing kernel modules, and even "novice" would be an overstatement of my current skills. What follows is stuff I gathered from different sources while trying to debug a kernel oops caused by a module: some googled, some from the LDD3 book. Put together, it gives more or less a basic strategy for getting started with debugging a kernel module.

As I figured out from reading LDD3, you can use one of these tools to debug a module:

  • plain ol’ gdb
  • kgdb
  • kdb

kgdb doesn’t really strike me as something I will be needing in the near future. I’m quite sure I won’t be taking the trouble to find another system to set up a debug session. But for all I know it might be invaluable to those involved in serious work.

kdb requires you to patch the kernel. I’ll admit I didn’t try this out of sheer laziness.

gdb should be a part of the arsenal of even a half-serious programmer and in my case, it was. There are just a few things that need to be in place before you start using it. First, you need the uncompressed kernel image, vmlinux (not vmlinuz). Second, you need to compile the kernel with some extra options to help you with debugging. The list below is again from the LDD3 book, Chapter 4.

CONFIG_DEBUG_KERNEL* 
CONFIG_DEBUG_SLAB
CONFIG_DEBUG_PAGEALLOC
CONFIG_DEBUG_SPINLOCK
CONFIG_DEBUG_SPINLOCK_SLEEP
CONFIG_INIT_DEBUG*
CONFIG_DEBUG_INFO*
CONFIG_MAGIC_SYSRQ
CONFIG_DEBUG_STACKOVERFLOW
CONFIG_DEBUG_STACK_USAGE
CONFIG_KALLSYMS*
CONFIG_IKCONFIG*
CONFIG_IKCONFIG_PROC*
CONFIG_ACPI_DEBUG
CONFIG_DEBUG_DRIVER
CONFIG_SCSI_CONSTANTS
CONFIG_INPUT_EVBUG
CONFIG_PROFILING*

It’s not that all of these are absolutely necessary to get any kind of debugging work done but you never know what kind of oops/kernel panic you might be facing. Still I have starred the ones that I feel *must* be enabled. But don’t go by my words, compile and recompile to find out the truth ;)

With all the yak mowing out of the way, you can finally start debugging the module with your freshly recompiled kernel.

Start the debugger with
# gdb /usr/src/linux/vmlinux /proc/kcore

But gdb doesn’t yet know where to find the module’s code and data sections. You can either do it manually by going into /sys/module/module_name/sections, cat-ing the values of .text, .data and .bss, and then running this command at the gdb prompt (the first address is that of .text)

(gdb) add-symbol-file /path/to/module 0xd081d000 \
-s .data 0xd08232c0 \
-s .bss 0xd0823e20

or this shell script will output the whole command for you:

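#!/bin/bash
#
# gdbline module image
#
# Outputs an add-symbol-file line suitable for pasting into gdb to examine
# a loaded module.
#
cd "/sys/module/$1/sections" || exit 1
echo -n "add-symbol-file $2 $(/bin/cat .text)"

# Dot-files first (.data, .bss, ...), then any section files without a dot.
for section in .[a-z]* *; do
    # Skip an unmatched glob pattern and the .text section already printed.
    if [ -e "$section" ] && [ "$section" != ".text" ]; then
        echo " \\"
        echo -n "	-s $section $(/bin/cat "$section")"
    fi
done
echo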

This information is again thanks to the LDD3 author Corbet, from this article. What would I have done without his book and articles?!

The module I was trying to debug was causing an oops due to a NULL pointer dereference, a class of bug that has been the source of quite a few vulnerabilities in the mainline kernel. This is what it looked like (taken from dmesg):

[27570.020736] BUG: unable to handle kernel NULL pointer dereference at 00000018
[27570.020747] IP: [<e07b3c31>] :plan9_net:socknet_connect+0xd1/0x110
[27570.020760] *pde = 00000000
[27570.020767] Oops: 0000 [#1] SMP
[snip]
[27570.020939]
[27570.020945] Pid: 8622, comm: bash Tainted: P          (2.6.27-14-generic #1)
[27570.020951] EIP: 0060:[<e07b3c31>] EFLAGS: 00010296 CPU: 0
[27570.020960] EIP is at socknet_connect+0xd1/0x110 [plan9_net]
[27570.020966] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00f60000
[27570.020971] ESI: de4182a8 EDI: 00000002 EBP: dddedf20 ESP: dddedef4
[27570.020977]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[27570.020983] Process bash (pid: 8622, ti=dddec000 task=c198f110 task.ti=dddec000)
[27570.020988] Stack: 00000000 de4189f0 de418308 17000002 0101a8c0 d0f85494 d0f85493 de4ed200
[27570.021004]        dda077c0 d0f85494 d0f85493 dddedf54 e07b37d5 00000000 dd4fd340 dda077c0
[27570.021019]        0000000e dd4fd340 de4182a8 d0f85493 dd4fd340 d0f85480 09582808 caaa4540
[27570.021035] Call Trace:
[27570.021041]  [<e07b37d5>] ? tcp_n_ctl_process+0x145/0x170 [plan9_net]
[27570.021053]  [<e07b3505>] ? slashnet_write_file+0x185/0x190 [plan9_net]
[27570.021070]  [<c01b2c70>] ? vfs_write+0xa0/0x110
[27570.021081]  [<e07b3380>] ? slashnet_write_file+0x0/0x190 [plan9_net]
[27570.021092]  [<c01b2db2>] ? sys_write+0x42/0x70
[27570.021101]  [<c0103f7b>] ? sysenter_do_call+0x12/0x2f
[27570.021110]  [<c0380000>] ? __down_killable+0x60/0xd0
[27570.021121]  =======================
[27570.021124] Code: 7b e0 bb ff ff ff ff e8 cc ab bc df eb 8f 8b 58 0c 8d 55 e0 b9 10 00 00 00 c7 04 24 00 00 00 00 ff 53 10 89 c3 8b 45 f0 8b 40 14 <8b> 40 18 c7 04 24 5c 3e 7b e0 89 44 24 04 e8 9a ab bc df 85 db
[27570.021211] EIP: [<e07b3c31>] socknet_connect+0xd1/0x110 [plan9_net] SS:ESP 0068:dddedef4
[27570.021235] ---[ end trace 1d54537d6fc8b3bc ]---

Phew, that’s a lot of information! In an oops message you get a dump of all the register values, the stack trace, the code bytes around the faulting instruction, and so on. I’ve given a couple of links at the end that deal with all the information present. Refer to them for more details.
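One detail of the dump can be read without any tools. The fault address is 00000018 and EAX is 00000000; the marked bytes `<8b> 40 18` in the Code line look like a load from EAX+0x18, which is what reading a structure member through a NULL pointer produces. The sketch below shows the arithmetic with a hypothetical structure (it is not taken from plan9_net): the address the CPU touches is the base pointer plus the member's offset.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical structure, NOT taken from the plan9_net module: six 32-bit
 * fields followed by the one being read, so `wanted` sits at offset 0x18.
 */
struct example {
    uint32_t pad[6];
    uint32_t wanted;
};

/*
 * Address that would be accessed for `base->wanted`, computed without
 * dereferencing anything. With base == 0 the result is just the offset.
 */
uintptr_t field_address(uintptr_t base)
{
    return base + offsetof(struct example, wanted);
}
```

So a small non-zero fault address in a "NULL pointer dereference" oops usually hints at which member of which structure was being read, once you know the layout.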

For now, we can see that the instruction that caused the NULL pointer dereference sits at an offset of 0xd1 into the socknet_connect function (the 0x110 after the slash is the size of the function). We’re very close to finding the errant piece of code now. Just do the following at the gdb prompt to home in on the culprit statement:
(gdb) list *(socknet_connect+0xd1)

..and we are done! Pretty simple and basic, wasn’t it?

These two links are really good for pointers on how to look for the necessary information in an oops message

  • Re: what’s an OOPS by John Bradford from LKML
  • A very detailed oops report analysis that’ll really help you with ‘how to get from bug report to the source of bug’

I’ve also been trying to use the offset information with the disassembled module to figure out which part of the source code it might actually correspond to. I haven’t met with much success though.
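For working with a disassembly, the useful pieces of a location like socknet_connect+0xd1/0x110 are the symbol name, the offset of the instruction within the function, and the function's total size; the offset is what you add to the function's start address in the listing. A small sketch of pulling those apart (the struct and function names are my own):

```c
#include <stdio.h>

/* Parsed form of an oops location such as "socknet_connect+0xd1/0x110". */
struct oops_location {
    char symbol[64];        /* function name */
    unsigned long offset;   /* bytes into the function */
    unsigned long size;     /* total size of the function in bytes */
};

/* Returns 0 on success, -1 if the text does not match symbol+0xOFF/0xSIZE. */
int parse_oops_location(const char *text, struct oops_location *loc)
{
    /* %63[^+] reads the name up to '+'; %lx accepts an optional 0x prefix. */
    int fields = sscanf(text, "%63[^+]+%lx/%lx",
                        loc->symbol, &loc->offset, &loc->size);
    return fields == 3 ? 0 : -1;
}
```

An offset that is larger than the size would be a sign that the symbol lookup went wrong, which is a cheap sanity check before trusting the rest of the trace.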


Adding a new system call to the linux kernel

anomit | April 6, 2009

I tried this thing last semester too, but I wasn’t too serious about it. I had decided to go with Gentoo because it makes frequent kernel rebuilds convenient. Somewhere down the line Gentoo got caught in a cyclic dependency error and I forgot about the whole thing. But I am digressing.

Anyway, I built gentoo from scratch and got things working. This step by step guide is quite good to get started. Note that this is about adding system calls to the kernel, not implementing them.

The guide is a bit old though, and just one thing needs to be changed. Step #16 mentions the use of the _syscallN macro. Don’t use it. From the man page of _syscall:

NAME

_syscall – invoking a system call without library support (OBSOLETE)

NOTES

Starting around kernel 2.6.18, the _syscall macros were removed from header files supplied to user space. Use syscall(2) instead.

The _syscall() macros do not produce a prototype. You may have to create one, especially for C++ users.

Instead use a function wrapper like this:

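#include <unistd.h>       /* syscall() */
#include <sys/syscall.h>  /* __NR_* system call numbers */

/* __NR_mycall is the number given to the new system call in the kernel tree. */
long mycall(int i, int *result)
{
	return syscall(__NR_mycall, i, result);
}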

i and result are the arguments I used for my syscall and quite obviously it would vary according to whatever you decide to write.
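Since __NR_mycall only exists on a kernel that has been patched, here is the same wrapper pattern applied to a system call that every Linux kernel already has, so it can be tried before rebuilding anything. This is only a sketch; raw_getpid is a made-up name and SYS_getpid stands in for the new call's number.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Same wrapper pattern as mycall(), but for a system call that already
 * exists. SYS_getpid stands in for __NR_mycall.
 */
long raw_getpid(void)
{
    /* On failure syscall() returns -1 and sets errno. */
    return syscall(SYS_getpid);
}
```

If this returns the same value as getpid(), the syscall() plumbing works and any remaining trouble is on the kernel side of the new call.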

There are quite a few other guides too on this topic but they are generally old and not updated. So in any case you do need to poke around quite a bit to get things working.

Some really good reading material:

  • Kernel command using Linux system calls ( uses the _syscallN macro in examples )
  • Playing with the cr0 register. This is a bit advanced for my current knowledge level and I’m in the process of fully understanding how the register works. Try at your own risk.

License

Creative Commons License
This work by Anomit Ghosh is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 India License.