Truth, Computing and Fail

Symlinks in a libfs virtual file system: The Pains

anomit | January 7, 2010

The only documentation you have got for libfs is the code itself, which says a lot about the hoops I had to jump through to at least get this thing working :)

I still don’t claim to know all the innards of libfs regarding this functionality. It took me around 3 days of poking around the code, plus quite a few kernel freezes and oopses, to figure this thing out. Note also that this is for a non-disk-based file system like /proc. I guess things will be different on file systems like ext3.

WARNING: This is quite a long-drawn post, as I haven’t presented the solution in a straightforward manner. I have rather chosen to ramble on about my personal experience. For a no-nonsense, tldr-free answer, visit this

To begin with, I was absolutely clueless. All this time I had been using `ln -s` or symlink() without worrying about what really goes on under the hood. Imagine my frustration when ln -s didn’t work on the file system. I had completely neglected this part and thought that it’d be somehow taken care of automagically.

Going through a few online resources, I managed to deduce that the symlink operation takes place at the inode level (whatever you make of that). Looking through the fields of struct inode, I found i_op, a pointer to a struct inode_operations. Referring to both the book Understanding the Linux Kernel and the source code for the fields in i_op, I found a few function pointers that had something to do with links, namely symlink(), readlink() and follow_link().

Initially, I thought of implementing getattr() so that it’d report S_IFLNK for the symlink, but the idea of having to handle attribute generation for the rest of the dentry objects was too much for my puny brain and this plan was discarded.

Reading through man 2 symlink, I came across this and instantly everything was clear:

Symbolic links are interpreted at run time as if the contents of the link had been substituted into the path being followed to find a file or directory.

First, I changed the file creation function a bit for the symlink so that the proper st_mode is set and put the target location string in the i_private field of the inode structure.

Then I created an inode_operations structure and put in the following function definitions:

static void *sample_follow_link (struct dentry *dentry, struct nameidata *nd)
{
    nd->depth = 0;
    nd_set_link(nd, (char *)dentry->d_inode->i_private);
    return NULL;
}

static struct inode_operations sample_inode_ops = {
    .readlink = generic_readlink,
    .follow_link = sample_follow_link,
};

....
/* in the function that creates the dentry and inode */
inode->i_op = &sample_inode_ops;

But it was the second step that actually took up most of my time. I read the man page of readlink and naively put in my own readlink implementation.

int sample_readlink(struct dentry *dentry, char __user *buffer, int buflen)
{
    unsigned long ret = copy_to_user(buffer, dentry->d_inode->i_private, buflen);
    if (ret == 0)
        return buflen;
    else
        return -EFAULT;
}

Kernel freezes happened and the wailing of wolves was heard in the distance. Putting in just generic_readlink didn’t solve the problem either, with a kernel oops happening every time ls was run. Looking into the definition of generic_readlink in the kernel source, I found that the corresponding comment block mentions that follow_link needs to be implemented for it to work. Looking into the code for ext2 gave me some idea about dealing with struct nameidata. Problem solved :)

Categories
Coding, GNU/Linux
Tags
c, file systems, inode, kernel, libfs, linux, syscalls, vfs

Small rant on the FUSE API reference

anomit | December 15, 2009

I generally don’t rant about such well-established open source projects, simply because I’m not even remotely qualified to do so. But I’ll be making an exception this time. The bad cold I got more than a week back simply refuses to go away, and poorly thought-out documentation rules really raised my hackles this time.

Sample this from the struct fuse_operations documentation page:

int(* fuse_operations::write)(const char *, const char *, size_t, off_t, struct fuse_file_info *) 

Read the accompanying description. How is someone who is just starting off with FUSE supposed to know what the two char *’s are for? So off you go looking into source code provided by some tutorial and thereby waste at least 10 minutes in the process.

As can be seen, the docs have been generated by Doxygen. I know it’s “experimental” and all, but is it really that difficult to write a slightly more detailed comment on the function?

Categories
Coding, GNU/Linux
Tags
c, file systems, fuse, linux

Kernel module debugging: a simple technique

anomit | November 4, 2009

Disclaimer: I have only just started developing kernel modules, and even “novice” would be an overstatement of my current skills. What follows is stuff I gathered from different sources while trying to debug a kernel oops caused by a module: some googled, some from the LDD3 book. Put together, it gives a more or less basic strategy to start debugging a kernel module.

As I figured out from reading LDD3, you can use one of these tools to debug a module:

  • plain ol’ gdb
  • kgdb
  • kdb

kgdb doesn’t really strike me as something I will be needing in the near future. I’m quite sure I won’t be taking the trouble to find another system to set up a debug session. But for all I know it might be invaluable to those involved in serious work.

kdb requires you to patch the kernel. I’ll admit I didn’t try this out of sheer laziness.

gdb should be a part of the arsenal of even a half-serious programmer and in my case, it was. There are just a few things that need to be in place before you start using it. First, you need the uncompressed kernel image, vmlinux (not vmlinuz). Second, you need to compile the kernel with some extra options to help you with debugging. This one is again from the LDD3 book, Chapter 4.

CONFIG_DEBUG_KERNEL* 
CONFIG_DEBUG_SLAB
CONFIG_DEBUG_PAGEALLOC
CONFIG_DEBUG_SPINLOCK
CONFIG_DEBUG_SPINLOCK_SLEEP
CONFIG_INIT_DEBUG*
CONFIG_DEBUG_INFO*
CONFIG_MAGIC_SYSRQ
CONFIG_DEBUG_STACKOVERFLOW
CONFIG_DEBUG_STACK_USAGE
CONFIG_KALLSYMS*
CONFIG_IKCONFIG*
CONFIG_IKCONFIG_PROC*
CONFIG_ACPI_DEBUG
CONFIG_DEBUG_DRIVER
CONFIG_SCSI_CONSTANTS
CONFIG_INPUT_EVBUG
CONFIG_PROFILING*

It’s not that all of these are absolutely necessary to get any kind of debugging work done but you never know what kind of oops/kernel panic you might be facing. Still I have starred the ones that I feel *must* be enabled. But don’t go by my words, compile and recompile to find out the truth ;)

With all the yak mowing out of the way, you can finally start debugging the module with your freshly recompiled kernel.

Start the debugger with
# gdb /usr/src/linux/vmlinux /proc/kcore

But gdb doesn’t yet know where to find the module’s code and data sections. You can either do it manually by going into /sys/module/module_name/sections, cat-ing the values of .text, .data and .bss, and then running this command at the gdb prompt (the first address is the value of .text)

(gdb) add-symbol-file /path/to/module 0xd081d000 \
-s .data 0xd08232c0 \
-s .bss 0xd0823e20

or this shell script will output the whole command for you:

#!/bin/bash
#
# gdbline module image
#
# Outputs an add-symbol-file line suitable for pasting into gdb to examine
# a loaded module.
#
cd "/sys/module/$1/sections" || exit 1
echo -n "add-symbol-file $2 $(/bin/cat .text)"

# Every other section file becomes a "-s name address" argument.
for section in .[a-z]* *; do
    if [ "$section" != ".text" ]; then
	echo  " \\"
	echo -n "	-s $section $(/bin/cat "$section")"
    fi
done
echo
echo

This information is again thanks to the LDD3 author Corbet, from this article. What would I have done without his book and articles?!

The module I was trying to debug was causing an oops due to null pointer dereferencing, which actually has been the source of quite a few vulnerabilities in the mainline kernel source. The following is what it looked like (got it from dmesg)

[27570.020736] BUG: unable to handle kernel NULL pointer dereference at 00000018
[27570.020747] IP: [<e07b3c31>] :plan9_net:socknet_connect+0xd1/0x110
[27570.020760] *pde = 00000000
[27570.020767] Oops: 0000 [#1] SMP
[snip]
[27570.020939]
[27570.020945] Pid: 8622, comm: bash Tainted: P          (2.6.27-14-generic #1)
[27570.020951] EIP: 0060:[<e07b3c31>] EFLAGS: 00010296 CPU: 0
[27570.020960] EIP is at socknet_connect+0xd1/0x110 [plan9_net]
[27570.020966] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00f60000
[27570.020971] ESI: de4182a8 EDI: 00000002 EBP: dddedf20 ESP: dddedef4
[27570.020977]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[27570.020983] Process bash (pid: 8622, ti=dddec000 task=c198f110 task.ti=dddec000)
[27570.020988] Stack: 00000000 de4189f0 de418308 17000002 0101a8c0 d0f85494 d0f85493 de4ed200
[27570.021004]        dda077c0 d0f85494 d0f85493 dddedf54 e07b37d5 00000000 dd4fd340 dda077c0
[27570.021019]        0000000e dd4fd340 de4182a8 d0f85493 dd4fd340 d0f85480 09582808 caaa4540
[27570.021035] Call Trace:
[27570.021041]  [<e07b37d5>] ? tcp_n_ctl_process+0x145/0x170 [plan9_net]
[27570.021053]  [<e07b3505>] ? slashnet_write_file+0x185/0x190 [plan9_net]
[27570.021070]  [<c01b2c70>] ? vfs_write+0xa0/0x110
[27570.021081]  [<e07b3380>] ? slashnet_write_file+0x0/0x190 [plan9_net]
[27570.021092]  [<c01b2db2>] ? sys_write+0x42/0x70
[27570.021101]  [<c0103f7b>] ? sysenter_do_call+0x12/0x2f
[27570.021110]  [<c0380000>] ? __down_killable+0x60/0xd0
[27570.021121]  =======================
[27570.021124] Code: 7b e0 bb ff ff ff ff e8 cc ab bc df eb 8f 8b 58 0c 8d 55 e0 b9 10 00 00 00 c7 04 24 00 00 00 00 ff 53 10 89 c3 8b 45 f0 8b 40 14 <8b> 40 18 c7 04 24 5c 3e 7b e0 89 44 24 04 e8 9a ab bc df 85 db
[27570.021211] EIP: [<e07b3c31>] socknet_connect+0xd1/0x110 [plan9_net] SS:ESP 0068:dddedef4
[27570.021235] ---[ end trace 1d54537d6fc8b3bc ]---

Phew that’s a lot of information! You get a dump of all the register values, the stacktrace, codetrace etc in an oops message. I’ve given a couple of links at the end that deal with all the information present. Refer to them for more details.

For now, we can see that the faulting instruction sits in the socknet_connect function, at an offset of 0xd1 bytes from its start, and that it caused the null pointer dereference. We’re very close to finding the errant piece of code now. Just do the following at the gdb prompt to home in right on the culprit statement:
(gdb) list *socknet_connect+0xd1

..and we are done! Pretty simple and basic, wasn’t it?

These two links are really good for pointers on how to look for the necessary information in an oops message

  • Re: what’s an OOPS by John Bradford from LKML
  • A very detailed oops report analysis that’ll really help you with ‘how to get from bug report to the source of bug’

I’ve also been trying to use the offset information with the disassembled module to figure out which part of the source code it might actually correspond to. I haven’t met with much success though.

Categories
Coding, GNU/Linux
Tags
c, Coding, kernel, linux, module

Vim/Cscope quickie

anomit | October 10, 2009

You can find a detailed tutorial here. This post will only help you get started quickly with cscope.

First of all, you need to have cscope support compiled with your vim installation. Packages for vim offered by Ubuntu and Fedora already have it. In Gentoo, I had to add app-editors/vim cscope in /etc/portage/package.use for the cscope USE flag to be enabled.

With all the yak shaving out of the way, let’s get straight down to business. Move the cursor over any symbol or identifier in your source code and try out these:

  • Ctrl-\-s: Search results are loaded in the same window
  • Ctrl-Space-s: Search results are loaded in a horizontally split window
  • Ctrl-Space(twice)-s: Search results are loaded in a vertically split window

You should also check out a link within that tutorial that will help you to use cscope efficiently with a large codebase such as the Linux kernel :)

PS: I forgot to mention that you need to install the plugin for these keymappings from the sourceforge project whose tutorial link I have already posted.

Categories
Coding, GNU/Linux
Tags
c, cscope, vim

PyCon India or Code Jam?

anomit | September 15, 2009

That is something which has been playing at the back of my mind for the past couple of days.

Google Code Jam Round 2 is on 26th Sept, 21:30 IST, right on the day PyCon India starts. In theory I can attend the 1st day of PyCon and compete in GCJ as well at night. But there are a few small things that have been bugging me like the possibility of being dead tired at the end of the day, lack of a decent internet connection etc.

Advancing to round 3 of GCJ would require me to be placed within the top 500 of the 3000 competitors. Effectively I have about 5-6 days in total to prepare for it, excluding the useless exams in between and a ~500 rank isn’t asking for too much. This is one of the reasons that I’m inclined to stay back instead of taking on a 10 hour long overnight bus journey coupled with running around the city for a whole day.

If you are reading this, put in a few words of wisdom (considering you have them at your disposal).

Categories
Coding, My Life
Tags
code jam, PyCon, python

Param and TechTatva 2009

anomit | September 4, 2009

TechTatva is the technical fest held at my college and this year it starts from 8th September, that is a few days from now. No use of rambling on about its greatness and the value it brings to an otherwise morbid campus life.

Param is the category under TechTatva broadly relating to the computer science and IT oriented events. I’m in charge of handling it this time. Since last year, we have been trying to infuse some freshness into this category with the help of new events. Last year it was Mobivision which was about application development on two mobile platforms: Android and Symbian S60. The event was a huge hit in the college with 200+ people turning up for the workshops on PyS60 and Android development.

This time we have introduced an algorithm-intensive event, Algosm, which will be hosted and evaluated online by Codechef. This is the first time such an event will be a part of Param, just as Mobivision was new last year. We hope this will help a lot of students in our college get acquainted with the overall concept of online programming contests.

Outstation participants are also welcome (obviously, because it’s an online event :) ). Do check out the Algosm page on the TechTatva website for more details.

Categories
Coding, My Life
Tags
college

Proof of suckage, early 2008

anomit | September 2, 2009

I applied for the Etherboot project in the 2008 edition of GSoC. Looking back it doesn’t look like a wise decision at all since my C skills sucked hairy camel balls back then.

Today I was reading through some shellcode and buffer overflow attack basics, and about an hour ago I happened to remember a certain question that a couple of developers from the Etherboot project asked me during the IRC screening of potential candidates. Right now it took me less than 5 minutes to come up with a hopefully correct solution. Back then I was absolutely at a loss as to how to even begin coding the problem. The logic of the problem is really simple but I had no idea how to put it down in concrete code. Take a look at the problem and the solution below and laugh at me.

/**
* Search memory for a 32-bit pattern match on a 32-bit boundary
*
* @v start             Start address of region to search
* @v len               Length of region, in bytes
* @v pattern           Pattern to search for
* @v mask              Mask of which bits in the pattern we care about
* @ret found           First address at which pattern is found
*
*
* The mask is used to indicate that we care about only part of the
* pattern matching.  For example, suppose we wanted to search the
* region for words of the form
*
*   0xabcdXXXX
*
* where X indicates that we don't care about that digit (i.e. that we
* would want to match on 0xabcd0000, or 0xabcd1234, or 0xabcdffff,
* etc.).  We would then call memsearch() as
*
*   memsearch ( start, len, 0xabcd0000, 0xffff0000 );
*/
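#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

uint32_t *memsearch ( uint32_t *start, size_t len, uint32_t pattern, uint32_t mask )
{
    /* len is in bytes; the search steps one 32-bit word at a time */
    size_t words = len / sizeof(uint32_t);
    size_t i;

    for (i = 0; i < words; i++)
    {
        /* compare the word stored at start[i], not the address itself */
        if ( (start[i] & mask) == (pattern & mask) )
            return &start[i];
    }
    return NULL;
}

int main()
{
    /* a small sample region to search, instead of raw memory from address 0 */
    uint32_t region[] = { 0x11111111, 0xabcd1234, 0x000000df };
    uint32_t *found = memsearch ( region, sizeof(region), 0x000000df, 0x000000ff );

    if (found)
        printf("Result is: %x (word %ld)\n", (unsigned)*found, (long)(found - region));
    else
        printf("Result is: not found\n");
    return 0;
}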

Categories
Coding, GNU/Linux
Tags
c, etherboot, pointers

Loki: my attempt at creating an online judge

anomit | July 5, 2009

This would be my first major open source contribution out in the wild. If you don’t know what an online judge means here, consider paying a visit to SPOJ or UVa Online Judge to get an idea.

I wrote a mammoth README file that covers all the issues and features of this system, so I won’t waste much space here copying the same thing. It’s hosted at github (Rohit, happy now? ;) ). Head over to the main tree of the project. Right now the source needs ugly hackery to get it running on another computer, but I promise to correct that in the next few releases. Till then, I believe the README will help you out.

Oh, the code is licensed GPLv2 by the way. But you already knew it, didn’t you? :P

Categories
Coding, GNU/Linux
Tags
c, linux, loki, signals, system calls

Moving to Google Code

anomit | June 27, 2009

I have decided to move some of my personal projects to Google Code so that it’d give me some impetus and make me get off my lazy ass and actually put some real effort into the things I like.

Er, not really. It’s more of a selfish decision. Right now I want other people to read and criticize my code so that I’d know where I’m going wrong and correct myself. Nothing’s more dangerous than complacency.

As a starter, I have uploaded the tiny mpd now-playing plugin I wrote for xchat about a year ago when I was learning socket programming in python. Showing the currently playing track was too easy. Right now I want it to become more like an interface to mpd that provides basic controls from the comfort of xchat.

I named the project mpd-xchat

PS: Just after I had uploaded the code, I found a project by the name of xchat-mpd :P

I hope to add one more project in the next few days :)

Categories
Coding, GNU/Linux
Tags
mpd, python, xhcat

Resource limits – Part II (hard limits)

anomit | June 26, 2009

So much for late night coding. Yesterday I missed out on a very important part: setting the hard limits on resources. Keep in mind that only a superuser process can raise a hard limit; any process may lower its own, though it cannot raise it back afterwards. Referring to the code of exec.c in the previous post, put in this after line no. 18.

limit.rlim_max = 1;

This is actually the hard limit and, as the man page says, it acts as a ceiling for the soft limit, i.e. rlim_cur. The advantage is that if the process exceeds the limit and yet continues running, such as by handling the SIGXCPU signal as in the previous example, a SIGKILL will be issued this time, which forces it to terminate.

Now compile exec.c as usual and run it. You won’t see any output, as we set the hard limit equal to the soft limit.

  • Soft limit reached, SIGXCPU sent
  • Process tries to handle it
  • At the same time the hard limit i.e. the ceiling for soft limit is reached too. So a SIGKILL is sent and the process terminates before any output.

References


setrlimit(2) man page

Categories
Coding, GNU/Linux
Tags
ipc, linux, syscall

License

Creative Commons License
This work by Anomit Ghosh is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 India License.