From 4aca87515a5083ae0e31ce3177189fd43b6d05ac Mon Sep 17 00:00:00 2001
From: Andreas Baumann
Date: Sat, 3 Jan 2015 13:58:15 +0100
Subject: patch to Vanilla Tomato 1.28

---
 .../src/router/busybox/docs/busybox.net/FAQ.html | 1146 ++++++++++++++++++++
 1 file changed, 1146 insertions(+)
 create mode 100644 release/src/router/busybox/docs/busybox.net/FAQ.html

(limited to 'release/src/router/busybox/docs/busybox.net/FAQ.html')

diff --git a/release/src/router/busybox/docs/busybox.net/FAQ.html b/release/src/router/busybox/docs/busybox.net/FAQ.html
new file mode 100644
index 00000000..7ed1394c
--- /dev/null
+++ b/release/src/router/busybox/docs/busybox.net/FAQ.html
@@ -0,0 +1,1146 @@
+
+
+

Frequently Asked Questions

+ +This is a collection of some of the more frequently asked questions +about BusyBox. Some of the questions even have answers. If you +have additions to this FAQ document, we would love to add them. + +

General questions

+
  1. How can I get started using BusyBox?
  2. How do I configure busybox?
  3. How do I build BusyBox with a cross-compiler?
  4. How do I build a BusyBox-based system?
  5. Which Linux kernel versions are supported?
  6. Which architectures does BusyBox run on?
  7. Which C libraries are supported?
  8. Can I include BusyBox as part of the software on my device?
  9. Where can I find other small utilities since busybox does not include the features I want?
  10. I demand that you add <favorite feature> right now! How come you don't answer all my questions on the mailing list instantly? I demand that you help me with all of my problems Right Now!
  11. I need help with BusyBox! What should I do?
  12. I need you to add <favorite feature>! Are the BusyBox developers willing to be paid in order to fix bugs or add in <favorite feature>? Are you willing to provide support contracts?
+ +

Troubleshooting

+
  1. I think I found a bug in BusyBox! What should I do?!
  2. I'm using an ancient version from the dawn of time and something's broken. Can you backport fixes for free?
  3. Busybox init isn't working!
  4. I can't configure busybox on my system.
  5. Why do I keep getting "sh: can't access tty; job control turned off" errors? Why doesn't Control-C work within my shell?
+ +

Misc. questions

+
  1. How do I change the time zone in busybox?
+ +

Programming questions

+
  1. What are the goals of busybox?
  2. What is the design of busybox?
  3. How is the source code organized?
  4. I want to make busybox even smaller, how do I go about it?
  5. Adding an applet to busybox
  6. What standards does busybox adhere to?
  7. Portability.
  8. Tips and tricks.
  9. Who are the BusyBox developers?
+ + +
+

General questions

+ +
+

How can I get started using BusyBox?

+ +

If you just want to try out busybox without installing it, download the + tarball, extract it, run "make defconfig", and then run "make". +

+

+ This will create a busybox binary with almost all features enabled. To try + out a busybox applet, type "./busybox [appletname] [options]", for + example "./busybox ls -l" or "./busybox cat LICENSE". Type "./busybox" + to see a command list, and "./busybox appletname --help" to see a brief + usage message for a given applet. +

+

+ BusyBox uses the name it was invoked under to determine which applet is + being invoked. (Try "mv busybox ls" and then "./ls -l".) Installing + busybox consists of creating symlinks (or hardlinks) to the busybox + binary for each applet in busybox, and making sure these links are in + the shell's command $PATH. The special applet name "busybox" (or with + any optional suffix, such as "busybox-static") uses the first argument + to determine which applet to run, as shown above. +

+

+ BusyBox also has a feature called the + "standalone shell", where the busybox + shell runs any built-in applets before checking the command path. This + feature is also enabled by "make allyesconfig", and to try it out run + the command line "PATH= ./busybox ash". This will blank your command path + and run busybox as your command shell, so the only commands it can find + (without an explicit path such as /bin/ls) are the built-in busybox ones. + This is another good way to see what's built into busybox. + Note that the standalone shell requires CONFIG_BUSYBOX_EXEC_PATH + to be set appropriately, depending on whether /proc/self/exe is + available. If you do not have /proc, then point that config option + to the location of your busybox binary, usually /bin/busybox. + (So if you set it to /proc/self/exe, and happen to be able to chroot into + your rootfs, you must mount /proc beforehand.) +

+

+ A typical indication that you set CONFIG_BUSYBOX_EXEC_PATH to /proc/self/exe but + forgot to mount /proc is: +

+$ /bin/echo $PATH
+/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11
+$ echo $PATH
+/bin/sh: echo: not found
+
+ +
+

How do I configure busybox?

+ +

Busybox is configured similarly to the Linux kernel. Create a default + configuration and then run "make menuconfig" to modify it. The end + result is a .config file that tells the busybox build process what features + to include. So instead of "./configure; make; make install" the equivalent + busybox build would be "make defconfig; make; make install". +

+ +

Busybox configured with all features enabled is a little under a megabyte + dynamically linked on x86. To create a smaller busybox, configure it with + fewer features. Individual busybox applets cost anywhere from a few + hundred bytes to tens of kilobytes. Disable unneeded applets using + menuconfig to save space. +

+ +

The most important busybox configurators are:

+ + + +

Some other configuration options are:

+ + +

Menuconfig modifies your .config file through an interactive menu where you can enable or disable + busybox features, and get help about each feature. + +

+ To build a smaller busybox binary, run "make menuconfig" and disable the + features you don't need. (Or run "make allnoconfig" and then use + menuconfig to add just the features you need. Don't forget to recompile + with "make" once you've finished configuring.) +

+ +
+

How do I build BusyBox with a cross-compiler?

+ +

+ To build busybox with a cross-compiler, specify CROSS_COMPILE=<prefix>. +

+

+ CROSS_COMPILE specifies the prefix used for all executables used + during compilation. Only gcc and related binutils executables + are prefixed with $(CROSS_COMPILE) in the makefiles. + CROSS_COMPILE can be set on the command line: +

+
+   make CROSS_COMPILE=arm-linux-uclibcgnueabi-
+
+

+ Alternatively, CROSS_COMPILE can be set in the environment. + By default CROSS_COMPILE is empty, so executables are not prefixed. +

+

+ To store the cross-compiler prefix in your .config, set the variable + CONFIG_CROSS_COMPILER_PREFIX accordingly in menuconfig or by + editing the .config file. +

+ +
+

How do I build a BusyBox-based system?

+ +

+ BusyBox is a package that replaces a dozen standard packages, but it is + not by itself a complete bootable system. Building an entire Linux + distribution from source is a bit beyond the scope of this FAQ, but it + understandably keeps cropping up on the mailing list, so here are some + pointers. +

+

+ Start by learning how to strip a working system down to the bare essentials + needed to run one or two commands, so you know what it is you actually + need. An excellent practical place to do + this is the Linux + BootDisk Howto, or for a more theoretical approach try + From + PowerUp to Bash Prompt. +

+

+ To learn how to build a working Linux system entirely from source code, + the place to go is the Linux + From Scratch project. They have an entire book of step-by-step + instructions you can + read online + or + download. + Be sure to check out the other sections of their main page, including + Beyond Linux From Scratch, Hardened Linux From Scratch, their Hints + directory, and their LiveCD project. (They also have mailing lists which + are better sources of answers to Linux-system building questions than + the busybox list.) +

+

+ If you want an automated yet customizable system builder which produces + a BusyBox and uClibc based system, try + buildroot, which is + another project by the maintainer of uClibc (Erik Andersen). + Download the tarball, extract it, unset CC, make. + For more instructions, see the website. +

+ +
+

Which Linux kernel versions are supported?

+ +

+ Full functionality requires Linux 2.4.x or better. (Earlier versions may + still work, but are no longer regularly tested.) A large fraction of the + code should run on just about anything. While the current code is fairly + Linux specific, it should be fairly easy to port the majority of the code + to support, say, FreeBSD or Solaris, or Mac OS X, or even Windows (if you + are into that sort of thing). +

+ +
+

Which architectures does BusyBox run on?

+ +

+ BusyBox in general will build on any architecture supported by gcc. + Kernel module loading for 2.4.x Linux kernels is currently + limited to ARM, CRIS, H8/300, x86, ia64, x86_64, m68k, MIPS, PowerPC, + S390, SH3/4/5, Sparc, and v850e. +

+

+ With 2.6.x kernels, module loading support should work on all architectures. +

+ +
+

Which C libraries are supported?

+ +

+ On Linux, BusyBox releases are tested against uClibc (0.9.27 or later) and + glibc (2.2 or later). Both should provide full functionality with busybox, + and if you find a bug we want to hear about it. +

+

+ Linux-libc5 is no longer maintained (and has no known advantages over + uClibc), dietlibc is known to have numerous unfixed bugs, and klibc is + missing too many features to build BusyBox. If you require a small C + library for Linux, the busybox developers recommend uClibc. +

+

+ Some BusyBox applets have been built and run under a combination + of newlib and libgloss (see + this thread). + This is still experimental, but may be supported in a future release. +

+ +
+

Can I include BusyBox as part of the software on my device?

+ +

+ Yes. As long as you fully comply + with the generous terms of the GPL BusyBox license you can ship BusyBox + as part of the software on your device. +

+ +
+

Where can I find other small utilities since busybox + does not include the features I want?

+ +

+ We maintain such a list on this site! +

+ +
+

I demand that you add <favorite feature> right now! How come you don't answer all my questions on the mailing list instantly? I demand that you help me with all of my problems Right Now!

+ +

+ You have not paid us a single cent and yet you still have the product of + many years of our work. We are not your slaves! We work on BusyBox + because we find it useful and interesting. If you go off flaming us, we + will ignore you. + +


+

I need help with BusyBox! What should I do?

+ +

+ If you find that you need help with BusyBox, you can ask for help on the + BusyBox mailing list at busybox@busybox.net.

+ +

In addition to the mailing list, Erik Andersen (andersee), Manuel Nova + (mjn3), Rob Landley (landley), Mike Frysinger (SpanKY), + Bernhard Reutner-Fischer (blindvt), and other long-time BusyBox developers + are known to hang out on the uClibc IRC channel: #uclibc on + irc.freenode.net. There is a + web archive of + daily logs of the #uclibc IRC channel going back to 2002. +

+ +

+ Please do not send private email to Rob, Erik, Manuel, or the other + BusyBox contributors asking for private help unless you are planning on + paying for consulting services. +

+ +

+ When we answer questions on the BusyBox mailing list, it helps everyone + since people with similar problems in the future will be able to get help + by searching the mailing list archives. Private help is reserved as a paid + service. If you need to use private communication, or if you are serious + about getting timely assistance with BusyBox, you should seriously consider + paying for consulting services. +

+ +
+

I need you to add <favorite feature>! Are the BusyBox developers willing to be paid in order to fix bugs or add in <favorite feature>? Are you willing to provide support contracts?

+ +

+ Yes we are. The easy way to sponsor a new feature is to post an offer on + the mailing list to see who's interested. You can also email the project's + maintainer and ask them to recommend someone. +

+ +
+

Troubleshooting

+ +
+

I think I found a bug in BusyBox! What should I do?

+ +

+ If you simply need help with using or configuring BusyBox, please submit a + detailed description of your problem to the BusyBox mailing list at busybox@busybox.net. + Please do not send email to individual developers asking + for private help unless you are planning on paying for consulting services. + When we answer questions on the BusyBox mailing list, it helps everyone, + while private answers help only you... +

+ +

+ Bug reports and new feature patches sometimes get lost when posted to the + mailing list, because the developers of BusyBox are busy people and have + only so much they can keep in their brains at a time. You can post a + polite reminder after 2-3 days without offending anybody. If that doesn't + result in a solution, please use the + BusyBox Bug + and Patch Tracking System to submit a detailed explanation and we'll + get to it as soon as we can. +

+ +

+ Note that bugs entered into the bug system without being mentioned on the + mailing list first may languish there for months before anyone even notices + them. We generally go through the bug system when preparing for new + development releases, to see what fell through the cracks while we were + off writing new features. (It's a fast/unreliable vs slow/reliable thing. + Saves retransmits, but the latency sucks.) +

+ +
+

I'm using an ancient version from the dawn of time and something's broken. Can you backport fixes for free?

+ +

Variants of this one get asked a lot.

+ +

The purpose of the BusyBox mailing list is to develop and improve BusyBox, +and we're happy to respond to our users' needs. But if you're coming to the +list for free tech support we're going to ask you to upgrade to a current +version before we try to diagnose your problem.

+ +

If you're building BusyBox 0.50 with uClibc 0.9.19 and gcc 1.27 there's a +fairly large chance that whatever problem you're seeing has already been fixed. +To get that fix, all you have to do is upgrade to a newer version. If you +don't at least _try_ that, you're wasting our time.

+ +

The volunteers are happy to fix any bugs you point out in the current +versions because doing so helps everybody and makes the project better. We +want to make the current version work for you. But diagnosing, debugging, and +backporting fixes to old versions isn't something we do for free, because it +doesn't help anybody but you. The cost of volunteer tech support is using a +reasonably current version of the project.

+ +

If you don't want to upgrade, you have the complete source code and thus +the ability to fix it yourself, or hire a consultant to do it for you. If you +got your version from a vendor who still supports the older version, they can +help you. But there are limits as to what the volunteers will feel obliged to +do for you.

+ +

As a rule of thumb, volunteers will generally answer polite questions about +a given version for about three years after its release before it's so old +we don't remember the answer off the top of our head. And if you want us to +put any _effort_ into tracking it down, we want you to put in a little effort +of your own by confirming it's still a problem with the current version. It's +also hard for us to fix a problem of yours if we can't reproduce it because +we don't have any systems running an environment that old.

+ +

A consultant will happily set up a special environment just to reproduce +your problem, and you can always ask on the list if any of the developers +have consulting rates.

+ +
+

Busybox init isn't working!

+ +

+ Init is the first program that runs, so it might be that no programs are + working on your new system because of a problem with your cross-compiler, + kernel, console settings, shared libraries, root filesystem... To rule all + that out, first build a statically linked version of the following "hello + world" program with your cross compiler toolchain: +

+
+#include <stdio.h>
+#include <unistd.h>
+
+int main(int argc, char *argv[])
+{
+  printf("Hello world!\n");
+  sleep(999999999);  /* keep running: the kernel panics if init exits */
+  return 0;
+}
+
+ +

+ Now try to boot your device with an "init=" argument pointing to your + hello world program. Did you see the hello world message? Until you + do, don't bother messing with busybox init. +

+ +

+ Once you've got it working statically linked, try getting it to work + dynamically linked. Then read the FAQ entry How + do I build a BusyBox-based system?, and the + documentation for BusyBox + init. +

+ +
+

I can't configure busybox on my system.

+ +

+ Configuring Busybox depends on a recent version of sed. Older + distributions (Red Hat 7.2, Debian 3.0) may not come with a + usable version. Luckily BusyBox can use its own sed to configure itself, + although this leads to a bit of a chicken and egg problem. + You can work around this by hand-configuring busybox to build with just + sed, then putting that sed in your path to configure the rest of busybox + with, like so: +

+ +
+  tar xvjf sources/busybox-x.x.x.tar.bz2
+  cd busybox-x.x.x
+  make allnoconfig
+  make include/bb_config.h
+  echo "CONFIG_SED=y" >> .config
+  echo "#undef ENABLE_SED" >> include/bb_config.h
+  echo "#define ENABLE_SED 1" >> include/bb_config.h
+  make
+  mv busybox sed
+  export PATH=`pwd`:"$PATH"
+
+ +

Then you can run "make defconfig" or "make menuconfig" normally.

+ +
+

Why do I keep getting "sh: can't access tty; job control turned off" errors? Why doesn't Control-C work within my shell?

+ +

+ Job control will be turned off since your shell cannot obtain a controlling + terminal. This typically happens when you run your shell on /dev/console. + The kernel will not provide a controlling terminal on the /dev/console + device. You should run your shell on a normal tty such as tty1 or ttyS0 + and everything will work perfectly. If you REALLY want your shell + to run on /dev/console, then you can hack your kernel (if you are into that + sort of thing) by changing drivers/char/tty_io.c to change the lines where + it sets "noctty = 1;" to instead set it to "0". I recommend you instead + run your shell on a real console... +

+ +
+

Misc. questions

+ +
+

How do I change the time zone in busybox?

+ +

Busybox has nothing to do with the timezone. Please consult your libc +documentation. (http://google.com/search?q=uclibc+glibc+timezone).

+ +
+

Development

+ +
+

What are the goals of busybox?

+ +

Busybox aims to be the smallest and simplest correct implementation of the +standard Linux command line tools. First and foremost, this means the +smallest executable size we can manage. We also want to have the simplest +and cleanest implementation we can manage, be standards +compliant, minimize run-time memory usage (heap and stack), run fast, and +take over the world.

+ +
+

What is the design of busybox?

+ +

Busybox is like a swiss army knife: one thing with many functions. +The busybox executable can act like many different programs depending on +the name used to invoke it. Normal practice is to create a bunch of symlinks +pointing to the busybox binary, each of which triggers a different busybox +function. (See getting started in the +FAQ for more information on usage, and the +busybox documentation for a list of symlink names and what they do.) + +

The "one binary to rule them all" approach is primarily for size reasons: a +single multi-purpose executable is smaller than many small files could be. +This way busybox only has one set of ELF headers, it can easily share code +between different apps even when statically linked, it has better packing +efficiency by avoiding gaps between files or compression dictionary resets, +and so on.

+ +

Work is underway on new options such as "make standalone" to build separate +binaries for each applet, and a "libbb.so" to make the busybox common code +available as a shared library. Neither is ready yet at the time of this +writing.

+ + + +
+

The applet directories

+ +

The directory "applets" contains the busybox startup code (applets.c and +busybox.c), and several subdirectories containing the code for the individual +applets.

+ +

Busybox execution starts with the main() function in applets/busybox.c, +which sets the global variable applet_name to argv[0] and calls +run_applet_and_exit() in applets/applets.c. That uses the applets[] array +(defined in include/busybox.h and filled out in include/applets.h) to +transfer control to the appropriate APPLET_main() function (such as +cat_main() or sed_main()). The individual applet takes it from there.

+ +

This is why calling busybox under a different name triggers different +functionality: main() looks up argv[0] in applets[] to get a function pointer +to APPLET_main().
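
As a rough illustration of the lookup described above, the sketch below maps an invocation name to a function pointer. It is not the busybox source: the table demo_applets, the functions hello_main and bye_main, and the helpers base_name and find_applet are hypothetical names made up for this example. Only the general idea (strip the directory part of argv[0], then search a table of applet names) comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical applet entry point type: same shape as main(). */
typedef int (*applet_main_fn)(int argc, char **argv);

struct demo_applet {
    const char *name;
    applet_main_fn main;
};

/* Two made-up applets that just return distinct exit codes. */
static int hello_main(int argc, char **argv) { (void)argc; (void)argv; return 10; }
static int bye_main(int argc, char **argv)   { (void)argc; (void)argv; return 20; }

static const struct demo_applet demo_applets[] = {
    { "bye",   bye_main },
    { "hello", hello_main },
};

/* Return the part of path after the last '/', or path itself. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash ? slash + 1 : path;
}

/* Look up argv[0] (with any directory stripped) in the table. */
static applet_main_fn find_applet(const char *argv0)
{
    const char *name;
    size_t i;

    assert(argv0 != NULL);
    name = base_name(argv0);
    for (i = 0; i < sizeof(demo_applets) / sizeof(demo_applets[0]); i++)
        if (strcmp(demo_applets[i].name, name) == 0)
            return demo_applets[i].main;
    return NULL; /* unknown name: the caller reports an error */
}
```

A main() built on this would call find_applet(argv[0]) and, when the result is not NULL, return whatever that function returns for argc and argv. A symlink named "hello" pointing at the binary would then reach hello_main().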

+ +

Busybox applets may also be invoked through the multiplexor applet +"busybox" (see busybox_main() in libbb/appletlib.c), and through the +standalone shell (grep for STANDALONE_SHELL in applets/shell/*.c). +See getting started in the +FAQ for more information on these alternate usage mechanisms, which are +just different ways to reach the relevant APPLET_main() function.

+ +

The applet subdirectories (archival, console-tools, coreutils, +debianutils, e2fsprogs, editors, findutils, init, loginutils, miscutils, +modutils, networking, procps, shell, sysklogd, and util-linux) correspond +to the configuration sub-menus in menuconfig. Each subdirectory contains the +code to implement the applets in that sub-menu, as well as a Config.in +file defining that configuration sub-menu (with dependencies and help text +for each applet), and the makefile segment (Makefile.in) for that +subdirectory.

+ +

The run-time --help is stored in usage_messages[], which is initialized at +the start of applets/applets.c and gets its help text from usage.h. During the +build this help text is also used to generate the BusyBox documentation (in +html, txt, and man page formats) in the docs directory. See +adding an applet to busybox for more +information.

+ +
+

libbb

+ +

Most non-setup code shared between busybox applets lives in the libbb +directory. It's a mess that evolved over the years without much auditing +or cleanup. For anybody looking for a great project to break into busybox +development with, documenting libbb would be both incredibly useful and good +experience.

+ +

Common themes in libbb include allocation functions that test +for failure and abort the program with an error message so the caller doesn't +have to test the return value (xmalloc(), xstrdup(), etc), wrapped versions +of open(), close(), read(), and write() that test for their own failures +and/or retry automatically, linked list management functions (llist.c), +command line argument parsing (getopt32.c), and a whole lot more.
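
The "test for failure and abort" pattern can be sketched as follows. This is a minimal illustration, not the libbb code: the names demo_xmalloc and demo_xstrdup and the error message are assumptions chosen for this example, and the real functions may differ in how they report the error.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an "x" allocation wrapper: it either
 * returns usable memory or terminates the program. */
static void *demo_xmalloc(size_t size)
{
    void *p = malloc(size);

    /* malloc(0) may legitimately return NULL, so only treat NULL
     * as a failure when memory was actually requested. */
    if (p == NULL && size != 0) {
        fputs("memory exhausted\n", stderr);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Duplicate a string; the caller never has to check for NULL. */
static char *demo_xstrdup(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);
    len = strlen(s) + 1;          /* include the terminating NUL */
    copy = demo_xmalloc(len);
    memcpy(copy, s, len);
    return copy;
}
```

The benefit is at the call site: an applet can write `char *name = demo_xstrdup(arg);` and use the result directly, which removes an error branch from every allocation.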

+ +
+

I want to make busybox even smaller, how do I go about it?

+ +

+ To conserve bytes it's good to know where they're being used, and the + size of the final executable isn't always a reliable indicator of + the size of the components (since various structures are rounded up, + so a small change may not even be visible by itself, but many small + savings add up). +

+ +

The busybox Makefile builds two versions of busybox, one of which + (busybox_unstripped) has extra information that various analysis tools + can use. (This has nothing to do with CONFIG_DEBUG, leave that off + when trying to optimize for size.) +

+ +

The "make bloatcheck" option uses Matt Mackall's bloat-o-meter + script to compare two versions of busybox (busybox_unstripped vs + busybox_old), and report which symbols changed size and by how much. + To use it, first build a base version with "make baseline". + (This creates busybox_old, which should have the original sizes for + comparison purposes.) Then build the new version with your changes + and run "make bloatcheck" to see the size differences from the old + version. +

+

+ The first line of output has totals: how many symbols were added or + removed, how many symbols grew or shrank, the number of bytes added + and number of bytes removed by these changes, and finally the total + number of bytes difference between the two files. The remaining + lines show each individual symbol, the old and new sizes, and the + increase or decrease in size (the results are sorted by this change). +
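
To make the totals concrete, here is a small sketch of the kind of comparison such a report is based on. It is not the bloat-o-meter script and does not parse nm output: the types sym_size and size_totals and the functions find_sym and compare_sizes are hypothetical, and the symbol tables are supplied by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sym_size {
    const char *name;
    long size;          /* size in bytes */
};

struct size_totals {
    int added, removed, grew, shrank;
    long delta;         /* new total minus old total, in bytes */
};

/* Find a symbol by name; returns NULL when it is absent. */
static const struct sym_size *find_sym(const struct sym_size *tab, size_t n,
                                       const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Compare two symbol tables and accumulate the summary counters. */
static struct size_totals compare_sizes(const struct sym_size *old_tab, size_t old_n,
                                        const struct sym_size *new_tab, size_t new_n)
{
    struct size_totals t = { 0, 0, 0, 0, 0 };
    size_t i;

    assert(old_tab != NULL && new_tab != NULL);
    for (i = 0; i < new_n; i++) {
        const struct sym_size *o = find_sym(old_tab, old_n, new_tab[i].name);

        if (o == NULL) {
            t.added++;                       /* symbol only in the new build */
            t.delta += new_tab[i].size;
        } else {
            if (new_tab[i].size > o->size)
                t.grew++;
            else if (new_tab[i].size < o->size)
                t.shrank++;
            t.delta += new_tab[i].size - o->size;
        }
    }
    for (i = 0; i < old_n; i++) {
        if (find_sym(new_tab, new_n, old_tab[i].name) == NULL) {
            t.removed++;                     /* symbol only in the old build */
            t.delta -= old_tab[i].size;
        }
    }
    return t;
}
```

With an old table of cat_main=200, sed_main=5000, gone=64 and a new table of cat_main=180, sed_main=5100, fresh=32, this yields one symbol added, one removed, one grown, one shrunk, and a net change of 48 bytes.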

+

+ The "make sizes" option produces raw symbol size information for + busybox_unstripped. This is the output from the "nm --size-sort" + command (see "man nm" for more information), and is the information + bloat-o-meter parses to produce the comparison report above. For + defconfig, this is a good way to find the largest symbols in the tree + (which is a good place to start when trying to shrink the code). To + take a closer look at individual applets, configure busybox with just + one applet (run "make allnoconfig" and then switch on a single applet + with menuconfig), and then use "make sizes" to see the size of that + applet's components. +

+

+ The "showasm" command (in the scripts directory) produces an assembly + dump of a function, providing a closer look at what changed. Try + "scripts/showasm busybox_unstripped" to list available symbols, and + "scripts/showasm busybox_unstripped symbolname" to see the assembly + for a specific symbol. +

+ +
+

Adding an applet to busybox

+ +

To add a new applet to busybox, first pick a name for the applet and +a corresponding CONFIG_NAME. Then do this:

+ + + +
+

What standards does busybox adhere to?

+ +

The standard we're paying attention to is the "Shell and Utilities" +portion of the Open +Group Base Standards (also known as the Single Unix Specification version +3 or SUSv3). Note that paying attention isn't necessarily the same thing as +following it.

+ +

SUSv3 doesn't even mention things like init, mount, tar, or losetup, nor +commonly used options like echo's '-e' and '-n', or sed's '-i'. Busybox is +driven by what real users actually need, not the fact the standard believes +we should implement ed or sccs. For size reasons, we're unlikely to include +much internationalization support beyond UTF-8, and on top of all that, our +configuration menu lets developers chop out features to produce smaller but +very non-standard utilities.

+ +

Also, Busybox is aimed primarily at Linux. Unix standards are interesting +because Linux tries to adhere to them, but portability to dozens of platforms +is only interesting in terms of offering a restricted feature set that works +everywhere, not growing dozens of platform-specific extensions. Busybox +should be portable to all hardware platforms Linux supports, and any other +similar operating systems that are easy to do and won't require much +maintenance.

+ +

In practice, standards compliance tends to be a clean-up step once an +applet is otherwise finished. When polishing and testing a busybox applet, +we ensure we have at least the option of full standards compliance, or else +document where we (intentionally) fall short.

+ +
+

Portability.

+ +

Busybox is a Linux project, but that doesn't mean we don't have to worry +about portability. First of all, there are different hardware platforms, +different C library implementations, different versions of the kernel and +build toolchain... The file "include/platform.h" exists to centralize and +encapsulate various platform-specific things in one place, so most busybox +code doesn't have to care where it's running.

+ +

To start with, Linux runs on dozens of hardware platforms. We try to test +each release on x86, x86-64, ARM, PowerPC, and MIPS. (Since qemu can handle +all of these, this isn't that hard.) This means we have to care about a number +of portability issues like endianness, word size, and alignment, all of which +belong in platform.h. That header handles conditional #includes and gives +us macros we can use in the rest of our code. At some point in the future +we might grow a platform.c, possibly even a platform subdirectory. As long +as the applets themselves don't have to care.
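
For readers unfamiliar with why endianness and alignment need central handling, the sketch below shows one way to read a big-endian 32-bit value portably. It is only an illustration and does not reproduce anything from platform.h: demo_swap32, demo_is_little_endian, and demo_read_be32 are hypothetical helper names, and it assumes the host is either little-endian or big-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t demo_swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v & 0x00ff0000u) >> 8)  |
           ((v & 0xff000000u) >> 24);
}

/* Return 1 on a little-endian host, 0 otherwise (checked at run time). */
static int demo_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* lowest-addressed byte of probe */
    return first == 1;
}

/* Read a big-endian 32-bit value from a byte buffer, regardless of
 * host byte order or buffer alignment. */
static uint32_t demo_read_be32(const unsigned char *buf)
{
    uint32_t v;

    assert(buf != NULL);
    memcpy(&v, buf, sizeof(v));  /* memcpy avoids an unaligned load */
    return demo_is_little_endian() ? demo_swap32(v) : v;
}
```

Code that calls a helper like demo_read_be32() never mentions the host byte order itself, which is the property the text asks of applets: the platform-specific decision lives in one place.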

+ +

On a related note, we made the "default signedness of char varies" problem +go away by feeding the compiler -funsigned-char. This gives us consistent +behavior on all platforms, and defaults to 8-bit clean text processing (which +gets us halfway to UTF-8 support). NOMMU support is less easily separated +(see the tips section later in this document), but we're working on it.

+ +

Another type of portability is build environments: we unapologetically use +a number of gcc and glibc extensions (as does the Linux kernel), but these have +been picked up by packages like uClibc, TCC, and Intel's C Compiler. As for +gcc, we take advantage of newer compiler optimizations to get the smallest +possible size, but we also regression test against an older build environment +using the Red Hat 9 image at "http://busybox.net/downloads/qemu". This has a +2.4 kernel, gcc 3.2, make 3.79.1, and glibc 2.3, and is the oldest +build/deployment environment we still put any effort into maintaining. (If +anyone takes an interest in older kernels you're welcome to submit patches, +but the effort would probably be better spent +trimming +down the 2.6 kernel.) Older gcc versions than that are uninteresting since +we now use c99 features, although +tcc might be worth a +look.

+ +

We also test busybox against the current release of uClibc. Older versions +of uClibc aren't very interesting (they were buggy, and uClibc wasn't really +usable as a general-purpose C library before version 0.9.26 anyway).

+ +

Other unix implementations are mostly uninteresting, since Linux binaries +have become the new standard for portable Unix programs. Specifically, +the ubiquity of Linux was cited as the main reason the Intel Binary +Compatibility Standard 2 died, by the standards group organized to name a +successor to ibcs2: the 86open +project. That project disbanded in 1999 with the endorsement of an +existing standard: Linux ELF binaries. Since then, the major players at the +time (such as AIX, Solaris, and +FreeBSD) +have all either grown Linux support or folded.

+ +

The major exceptions are newcomer Mac OS X, some embedded environments +(such as newlib+libgloss) which provide a POSIX environment but not a full +Linux environment, and environments like Cygwin that provide only partial Linux +emulation. Also, some embedded Linux systems run a Linux kernel but amputate +things like the /proc directory to save space.

+ +

Supporting these systems is largely a question of providing a clean subset +of BusyBox's functionality -- whichever applets can easily be made to +work in that environment. Annotating the configuration system to +indicate which applets require which prerequisites (such as procfs) is +also welcome. Other efforts to support these systems (swapping #include +files to build in different environments, adding adapter code to platform.h, +adding more extensive special-case supporting infrastructure such as mount's +legacy mtab support) are handled on a case-by-case basis. Support that can be +cleanly hidden in platform.h is reasonably attractive, and failing that +support that can be cleanly separated into a separate conditionally compiled +file is at least worth a look. Special-case code in the body of an applet is +something we're trying to avoid.

+ +
+

Programming tips and tricks.

+ +

Various things busybox uses that aren't particularly well documented +elsewhere.

+ +
+

Encrypted Passwords

+ +

Password fields in /etc/passwd and /etc/shadow are in a special format. +If the first character isn't '$', then it's an old DES style password. If +the first character is '$' then the password is actually three fields +separated by '$' characters:

+
+  $type$salt$encrypted_password
+
+ +

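The "type" indicates which encryption algorithm to use: 1 for MD5. Other values select other algorithms where the C library supports them (glibc, for example, uses 5 for SHA-256 and 6 for SHA-512).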

+ +

The "salt" is a bunch of random characters (generally 8) the encryption +algorithm uses to perturb the password in a known and reproducible way (such +as by appending the random data to the unencrypted password, or combining +them with exclusive or). Salt is randomly generated when setting a password, +and then the same salt value is re-used when checking the password. (Salt is +thus stored unencrypted.)

+ +

The advantage of using salt is that the same cleartext password encrypted +with a different salt value produces a different encrypted value. +If each encrypted password uses a different salt value, an attacker is forced +to do the cryptographic math all over again for each password they want to +check. Without salt, they could simply produce a big dictionary of commonly +used passwords ahead of time, and look up each password in a stolen password +file to see if it's a known value. (Even if there are billions of possible +passwords in the dictionary, checking each one is just a binary search against +a file only a few gigabytes long.) With salt they can't even tell if two +different users share the same password without guessing what that password +is and decrypting it. They also can't precompute the attack dictionary for +a specific password until they know what the salt value is.

+ +

The third field is the encrypted password (plus the salt). For md5 this +is 22 bytes.
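As a rough illustration of the field layout described above, here is a minimal sketch that splits a password field into its parts. The function name split_pw_field() and its return conventions are made up for this example; it is not a BusyBox function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of BusyBox): split "$type$salt$hash".
 * Returns 1 and fills type, salt and *hash for a '$' style field,
 * 0 for an old DES style field (no leading '$'),
 * -1 if the field is malformed or an output buffer is too small. */
int split_pw_field(const char *field,
                   char *type, size_t type_len,
                   char *salt, size_t salt_len,
                   const char **hash)
{
    const char *t, *s, *h;
    size_t n;

    if (field[0] != '$')
        return 0;                 /* DES style: no sub-fields */

    t = field + 1;                /* start of the type field */
    s = strchr(t, '$');           /* '$' that ends the type field */
    if (s == NULL)
        return -1;
    n = (size_t)(s - t);
    if (n == 0 || n >= type_len)
        return -1;
    memcpy(type, t, n);
    type[n] = '\0';

    s++;                          /* start of the salt field */
    h = strchr(s, '$');           /* '$' that ends the salt, if present */
    n = h ? (size_t)(h - s) : strlen(s);
    if (n >= salt_len)
        return -1;
    memcpy(salt, s, n);
    salt[n] = '\0';

    /* Points at an empty string when the third field is absent. */
    *hash = h ? h + 1 : s + n;
    return 1;
}
```

For an MD5 entry, the string left in *hash would be the 22 character encrypted value.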

+ +

The busybox function to handle all this is pw_encrypt(clear, salt) in +"libbb/pw_encrypt.c". The first argument is the clear text password to be +encrypted, and the second is a string in "$type$salt$password" format, from +which the "type" and "salt" fields will be extracted to produce an encrypted +value. (Only the first two fields are needed, the third $ is equivalent to +the end of the string.) The return value is an encrypted password in +/etc/passwd format, with all three $ separated fields. It's stored in +a static buffer, 128 bytes long.

+ +

So when checking an existing password, if pw_encrypt(text, +old_encrypted_password) returns a string that compares identical to +old_encrypted_password, you've got the right password. When setting a new +password, generate a random 8 character salt string, put it in the right +format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the +second argument to pw_encrypt(text,buffer).
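The set and check steps above might be sketched like this. The helper names (make_salt, make_setting, password_matches) are invented for the example, the encrypt argument merely stands in for a pw_encrypt(clear, salt) style function, and rand() is used only to keep the sketch short; real code would want a better source of randomness.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The 64 characters conventionally allowed in a crypt salt. */
static const char salt_chars[] =
    "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/* Fill salt[0..len-1] with random salt characters and terminate it.
 * The caller must provide room for len + 1 bytes. */
void make_salt(char *salt, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        salt[i] = salt_chars[rand() % 64];
    salt[len] = '\0';
}

/* Build the "$type$salt" string used as the second argument.
 * Returns 0 on success, -1 if buf is too small. */
int make_setting(char *buf, size_t buf_len, char type, const char *salt)
{
    int n = snprintf(buf, buf_len, "$%c$%s", type, salt);

    return (n < 0 || (size_t)n >= buf_len) ? -1 : 0;
}

/* Check a clear text password against a stored entry: re-encrypt with
 * the stored entry as the salt argument and compare the two strings. */
int password_matches(const char *clear, const char *stored,
                     char *(*encrypt)(const char *, const char *))
{
    const char *result = encrypt(clear, stored);

    return result != NULL && strcmp(result, stored) == 0;
}
```

Passing the encryption routine as a function pointer is just a way to keep the sketch independent of libbb; inside BusyBox you would call pw_encrypt() directly.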

+ +
+

Fork and vfork

+ +

On systems that haven't got a Memory Management Unit, fork() is unreasonably +expensive to implement (and sometimes even impossible), so a less capable +function called vfork() is used instead. (Using vfork() on a system with an +MMU is like pounding a nail with a wrench. Not the best tool for the job, but +it works.)

+ +

Busybox hides the difference between fork() and vfork() in +libbb/bb_fork_exec.c. If you ever want to fork and exec, use bb_fork_exec() +(which returns a pid and takes the same arguments as execve(), although in +this case envp can be NULL) and don't worry about it. This description is +here in case you want to know why that does what it does.

+ +

Implementing fork() depends on having a Memory Management Unit. With an +MMU you can simply set up a second set of page tables and share the +physical memory via copy-on-write. So a fork() followed quickly by exec() +only copies a few pages of the parent's memory, just the ones it changes +before freeing them.

+ +

With a very primitive MMU (using a base pointer plus length instead of page +tables, which can provide virtual addresses and protect processes from each +other, but no copy on write) you can still implement fork. But it's +unreasonably expensive, because you have to copy all the parent process' +memory into the new process (which could easily be several megabytes per fork). +And you have to do this even though that memory gets freed again as soon as the +exec happens. (This is not just slow and a waste of space but causes memory +usage spikes that can easily cause the system to run out of memory.)

+ +

Without even a primitive MMU, you have no virtual addresses. Every process +can reach out and touch any other process' memory, because all pointers are to +physical addresses with no protection. Even if you copy a process' memory to +new physical addresses, all of its pointers point to the old objects in the +old process. (Searching through the new copy's memory for pointers and +redirecting them to the new locations is not an easy problem.)

+ +

So with a primitive or missing MMU, fork() is just not a good idea.

+ +

In theory, vfork() is just a fork() that writeably shares the heap and stack +rather than copying it (so what one process writes the other one sees). In +practice, vfork() has to suspend the parent process until the child does exec, +at which point the parent wakes up and resumes by returning from the call to +vfork(). All modern kernel/libc combinations implement vfork() to put the +parent to sleep until the child does its exec. There's just no other way to +make it work: the parent has to know the child has done its exec() or exit() +before it's safe to return from the function it's in, so it has to block +until that happens. In fact without suspending the parent there's no way to +even store separate copies of the return value (the pid) from the vfork() call +itself: both assignments write into the same memory location.

+ +

One way to understand (and in fact implement) vfork() is this: imagine +the parent does a setjmp and then continues on (pretending to be the child) +until the exec() comes around, then the _exec_ does the actual fork, and the +parent does a longjmp back to the original vfork call and continues on from +there. (It thus becomes obvious why the child can't return, or modify +local variables it doesn't want the parent to see changed when it resumes.) + +

Note a common mistake: the need for vfork doesn't mean you can't have two +processes running at the same time. It means you can't have two processes +sharing the same memory without stomping all over each other. As soon as +the child calls exec(), the parent resumes.

+ +

If the child's attempt to call exec() fails, the child should call _exit() +rather than a normal exit(). This avoids any atexit() code that might confuse +the parent. (The parent should never call _exit(), only a vforked child that +failed to exec.)
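The rules above (do nothing in the child but exec, and call _exit() if the exec fails) can be sketched as follows. This is not bb_fork_exec(); spawn_and_wait() is a hypothetical helper, and the exit status 127 for a failed exec is a shell convention assumed for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] via vfork() + exec and wait for it.
 * Returns the child's exit status, or -1 on failure. */
int spawn_and_wait(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: it is borrowing the parent's memory, so it does
         * nothing here except try to exec. */
        execvp(argv[0], argv);
        /* exec failed: leave without running atexit() handlers. */
        _exit(127);
    }

    /* Parent: it only gets here once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the child touches no local variables of its own, which sidesteps the question of what the parent sees when it resumes.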

+ +

(Now in theory, a nommu system could just copy the _stack_ when it forks +(which presumably is much shorter than the heap), and leave the heap shared, +even with no MMU at all. +In practice, you've just wound up in a multi-threaded situation and you can't +do a malloc() or free() on your heap without freeing the other process' memory +(and if you don't have the proper locking for being threaded, corrupting the +heap if both of you try to do it at the same time and wind up stomping on +each other while traversing the free memory lists). The thing about vfork is +that it's a big red flag warning "there be dragons here" rather than +something subtle and thus even more dangerous.)

+ +
+

Short reads and writes

+ +

Busybox has special functions, bb_full_read() and bb_full_write(), to +check that all the data we asked for got read or written. Is this a real +world consideration? Try the following:

+ +
while true; do echo hello; sleep 1; done | tee out.txt
+ +

If tee is implemented with bb_full_read(), tee doesn't display output +in real time but blocks until its entire input buffer (generally a couple +kilobytes) is read, then displays it all at once. In that case, we _want_ +the short read, for user interface reasons. (Note that read() should never +return 0 unless it has hit the end of input, and an attempt to write 0 +bytes should be ignored by the OS.)
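To see why a "full read" blocks in that situation, here is a minimal sketch of such a loop. It is written in the spirit of bb_full_read() but is not the BusyBox source; the name read_all() is made up for the example.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until len bytes have
 * arrived, end of input is reached, or an error occurs.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t total = 0;

    while (total < len) {
        ssize_t n = read(fd, p + total, len - total);

        if (n < 0) {
            if (errno == EINTR)
                continue;         /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;                /* end of input */
        total += (size_t)n;       /* short read: go around again */
    }
    return (ssize_t)total;
}
```

The loop only returns early at end of input, so a caller asking for a full buffer from a slow pipe sits in read() until the buffer fills. A tee that wants real time output would call read() once and pass along whatever it got.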

+ +

As for short writes, play around with two processes piping data to each +other on the command line (cat bigfile | gzip > out.gz) and suspend and +resume a few times (ctrl-z to suspend, "fg" to resume). The writer can +experience short writes, which are especially dangerous because if you don't +notice them you'll discard data. They can also happen when a system is under +load and a fast process is piping to a slower one. (Such as an xterm waiting +on x11 when the scheduler decides X is being a CPU hog with all that +text console scrolling...)
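A loop that guards against discarding data on a short write might look like the sketch below. Again this is only an illustration in the spirit of bb_full_write(), with write_all() as a made-up name.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes have
 * been accepted or an error occurs.
 * Returns len on success, or -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t total = 0;

    while (total < len) {
        ssize_t n = write(fd, p + total, len - total);

        if (n < 0) {
            if (errno == EINTR)
                continue;         /* interrupted by a signal: retry */
            return -1;
        }
        total += (size_t)n;       /* short write: send the remainder */
    }
    return (ssize_t)total;
}
```

The important part is that the return value of write() is never ignored: whatever was not accepted gets offered again.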

+ +

So will data always be read from the far end of a pipe at the +same chunk sizes it was written in? Nope. Don't rely on that. For one +counterexample, see RFC 896 +for Nagle's algorithm, which waits a fraction of a second or so before +sending out small amounts of data through a TCP/IP connection in case more +data comes in that can be merged into the same packet. (In case you were +wondering why action games that use TCP/IP set TCP_NODELAY to lower the latency +on their sockets, now you know.)
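For reference, setting TCP_NODELAY is a single setsockopt() call. The wrapper name set_nodelay() below is hypothetical; the socket option itself is the standard one.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical wrapper: disable Nagle's algorithm on a TCP socket so
 * small writes are sent immediately instead of being held back.
 * Returns 0 on success, -1 on error. */
int set_nodelay(int sock)
{
    int one = 1;

    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```

Even with the option set, a reader still can't assume it receives data in the same chunk sizes the writer used.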

+ +
+

Memory used by relocatable code, PIC, and static linking.

+ +

The downside of standard dynamic linking is that it results in self-modifying +code. Although each executable's pages are mmap()ed into a process' address +space from the executable file and are thus naturally shared between processes +out of the page cache, the library loader (ld-linux.so.2 or ld-uClibc.so.0) +writes to these pages to supply addresses for relocatable symbols. This +dirties the pages, triggering copy-on-write allocation of new memory for each +process' dirtied pages.

+ +

One solution to this is Position Independent Code (PIC), a way of linking +a file so all the relocations are grouped together. This dirties fewer +pages (often just a single page) for each process' relocations. The down +side is this results in larger executables, which take up more space on disk +(and a correspondingly larger space in memory). But when many copies of the +same program are running, PIC dynamic linking trades a larger disk footprint +for a smaller memory footprint, by sharing more pages.

+ +

Another solution is static linking. A statically linked program has no +relocations, and thus the entire executable is shared between all running +instances. This tends to have a significantly larger disk footprint, but +on a system with only one or two executables, shared libraries aren't much +of a win anyway.

+ +

You can tell the glibc linker to display debugging information about its +relocations with the environment variable "LD_DEBUG". Try +"LD_DEBUG=help /bin/true" for a list of commands. Learning to interpret +"LD_DEBUG=statistics cat /proc/self/statm" could be interesting.
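If you'd rather measure dirty pages than interpret linker statistics, reasonably recent Linux kernels also provide /proc/self/smaps, which reports a "Private_Dirty:" figure in kB for each mapping. A minimal sketch that adds those figures up follows; sum_private_dirty_kb() is a made-up name, and it assumes the usual "Private_Dirty:  N kB" line format.

```c
#include <stdio.h>

/* Hypothetical helper: sum the "Private_Dirty:" lines of a stream in
 * /proc/<pid>/smaps format. Returns the total in kB. */
long sum_private_dirty_kb(FILE *fp)
{
    char line[256];
    long total = 0;
    long kb;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Only lines that start with the literal tag will match. */
        if (sscanf(line, "Private_Dirty: %ld kB", &kb) == 1)
            total += kb;
    }
    return total;
}
```

A program could call this on fopen("/proc/self/smaps", "r") at startup to see how much memory relocation has already dirtied, then compare dynamic, PIC, and static builds of the same code.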

+ +

For more on this topic, here's Rich Felker:

+
+

Dynamic linking (without fixed load addresses) fundamentally requires +at least one dirty page per dso that uses symbols. Making calls (but +never taking the address explicitly) to functions within the same dso +does not require a dirty page by itself, but will with ELF unless you +use -Bsymbolic or hidden symbols when linking.

+ +

ELF uses significant additional stack space for the kernel to pass all +the ELF data structures to the newly created process image. These are +located above the argument list and environment. This normally adds 1 +dirty page to the process size.

+ +

The ELF dynamic linker has its own data segment, adding one or more +dirty pages. I believe it also performs relocations on itself.

+ +

The ELF dynamic linker makes significant dynamic allocations to manage +the global symbol table and the loaded dso's. This data is never +freed. It will be needed again if libdl is used, so unconditionally +freeing it is not possible, but normal programs do not use libdl. Of +course with glibc all programs use libdl (due to nsswitch) so the +issue was never addressed.

+ +

ELF also has the issue that segments are not page-aligned on disk. +This saves up to 4k on disk, but at the expense of using an additional +dirty page in most cases, due to a large portion of the first data +page being filled with a duplicate copy of the last text page.

+ +

The above is just a partial list of the tiny memory penalties of ELF +dynamic linking, which eventually add up to quite a bit. The smallest +I've been able to get a process down to is 8 dirty pages, and the +above factors seem to mostly account for it (but some were difficult +to measure).

+
+ +
+

Including kernel headers

+ +

The "linux" or "asm" directories of /usr/include +contain Linux kernel +headers, so that the C library can talk directly to the Linux kernel. In +a perfect world, applications shouldn't include these headers directly, but +we don't live in a perfect world.

+ +

For example, Busybox's losetup code wants linux/loop.h because nothing else +#defines the structures to call the kernel's loopback device setup ioctls. +Attempts to cut and paste the information into a local busybox header file +proved incredibly painful, because portions of the loop_info structure vary by +architecture, namely the type __kernel_dev_t has different sizes on alpha, +arm, x86, and so on. Meaning we either #include <linux/posix_types.h> or +we hardwire #ifdefs to check what platform we're building on and define this +type appropriately for every single hardware architecture supported by +Linux, which is simply unworkable.

+ +

This is aside from the fact that the relevant type defined in +posix_types.h was renamed to __kernel_old_dev_t during the 2.5 series, so +to cut and paste the structure into our header we have to #include +<linux/version.h> to figure out which name to use. (What we actually +do is +check if we're building on 2.6, and if so just use the new 64 bit structure +instead to avoid the rename entirely.) But we still need the version +check, since 2.4 didn't have the 64 bit structure.
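The version check described above could look roughly like the sketch below. It is not the actual losetup source: the names example_loop_info, EXAMPLE_LOOP_GET_STATUS, and query_loop() are invented here, and it assumes the kernel headers are installed under /usr/include/linux.

```c
#include <sys/ioctl.h>
#include <linux/loop.h>
#include <linux/version.h>

/* Pick the 64 bit structure when building against 2.6 or later
 * headers, and fall back to the old one otherwise. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 0)
typedef struct loop_info64 example_loop_info;
# define EXAMPLE_LOOP_GET_STATUS LOOP_GET_STATUS64
#else
typedef struct loop_info example_loop_info;
# define EXAMPLE_LOOP_GET_STATUS LOOP_GET_STATUS
#endif

/* Hypothetical helper: ask the kernel about a loop device.
 * Returns 0 on success, -1 on error (fd is not a loop device, etc.). */
int query_loop(int fd, example_loop_info *info)
{
    return ioctl(fd, EXAMPLE_LOOP_GET_STATUS, info);
}
```

The rest of the code then refers only to the typedef and the macro, which keeps the version dependence in one place.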

+ +

The BusyBox developers spent two years trying to figure +out a clean way to do all this. There isn't one. The losetup in the +util-linux package from kernel.org isn't doing it cleanly either, they just +hide the ugliness by nesting #include files. Their mount/loop.h +#includes "my_dev_t.h", which #includes <linux/posix_types.h> +and <linux/version.h> just like we do. There simply is no alternative. +

+ +

Just because directly #including kernel headers is sometimes +unavoidable doesn't mean we should include them when there's a better +way to do it. However, block copying information out of the kernel headers +is not a better way.

+ +
+

Who are the BusyBox developers?

+ +

The following login accounts currently exist on busybox.net. (I.e., these +people can commit patches +into subversion for the BusyBox, uClibc, and buildroot projects.)

+ +
+aldot     :Bernhard Reutner-Fischer
+andersen  :Erik Andersen      - uClibc and BuildRoot maintainer.
+bug1      :Glenn McGrath
+davidm    :David McCullough
+gkajmowi  :Garrett Kajmowicz  - uClibc++ maintainer
+jbglaw    :Jan-Benedict Glaw
+jocke     :Joakim Tjernlund
+landley   :Rob Landley
+lethal    :Paul Mundt
+mjn3      :Manuel Novoa III
+osuadmin  :osuadmin
+pgf       :Paul Fox
+pkj       :Peter Kjellerstedt
+prpplague :David Anders
+psm       :Peter S. Mazinger
+russ      :Russ Dill
+sandman   :Robert Griebl
+sjhill    :Steven J. Hill
+solar     :Ned Ludd
+timr      :Tim Riker
+tobiasa   :Tobias Anderberg
+vapier    :Mike Frysinger
+vda       :Denys Vlasenko     - BusyBox maintainer
+
+ +

The following accounts used to exist on busybox.net, but don't anymore so +I can't ask /etc/passwd for their names. Rob Wentworth +<robwen at gmail.com> asked Google and recovered the names:

+ +
+aaronl   :Aaron Lehmann
+beppu    :John Beppu
+dwhedon  :David Whedon
+erik     :Erik Andersen
+gfeldman :Gennady Feldman
+jimg     :Jim Gleason
+kraai    :Matt Kraai
+markw    :Mark Whitley
+miles    :Miles Bader
+proski   :Pavel Roskin
+rjune    :Richard June
+tausq    :Randolph Chung
+vodz     :Vladimir N. Oleynik
+
+ + +
+
+
+ + -- cgit v1.2.3-54-g00ecf