Linker and Loader
overview
Program libraries can be divided into three types: static libraries, shared libraries, and dynamically loaded (DL) libraries(plugin).
static lib is part of the application, they are just some simple object file archive. dynamic lib is missing puzzle of the application loaded by OS. plugin is lib with predefined interface, and are explicit loaded by application not OS
compiler side:
A definition of a variable induces the compiler to reserve some space for that variable, and possibly fill that space with a particular value.
A definition of a function induces the compiler to generate code for that function.
and declaration promise the compiler that this symbol is someplace else.
then we got a .o file with code and data(global variables) the compiler leave a blank when see a declaration without implementation.
linker side: filling the blank, relocate, make object files to a executable.
- static link: the content of file are physically inserted into executable.(order of the events: cyclic dependency)
- shared library(defer to dynamic link): it doesn’t include in the final executable, but recored the name of the library, on run time, os pull the code of library and join all part before main function is run. (nm & ldd). this is why dll hell, loader re-link every time it runs. the crt.h is linked which delcared dynamic library but didn’t define it. the implementation is loaded on run-time.
- plugin: even more defer linking.(dlopen), it only bring in the dynamic linked file until invokation.
loader side:
- load code/ text segment
- bss to save space on disk
Phase Static Dynamic
-------- ---------------------- ------------------------
+---------+ +---------+
| main.c | | main.c |
+---------+ +---------+
Compile........|.........................|...................
+---------+ +---------+ +---------+ +--------+
| main.o | | crt.a | | main.o | | crt.h |
+---------+ +---------+ +---------+ +--------+
Link...........|..........|..............|...........|.......
| | +-----------+
| | |
+---------+ | +---------+ +--------+
| main |-----+ | main | | crt.so |
+---------+ +---------+ +--------+
Load/Run.......|.........................|..........|........
+---------+ +---------+ |
| main in | | main in |-----+
| memory | | memory |
+---------+ +---------+
Dynamic linking
the job of dynamic linker and loader(ld.so) is to resolve the executable’s dependencies on shared libraries and to load the required ones at run-time. readelf .bin to find the NEDDED headers and search the DSO with a matching SONAME
demo in action
- gcc -c -fpic foo.c => foo.o
- gcc -shared -o libfoo.so foo.o =>libfoo.so
- gcc -Wall -o test main.c -lfoo => error, cannot find -lfoo *. gcc -o test main.c -L$PWD -lfoo => good *. gcc -o test main.c -I$PWD/include -L$PWD/lib -lfoo => even better
- run it => loader can’t find shared libfoo.so because it is not in standard path (/usr/lib /lib)
- LD_LIBRARY_PATH, and export it
- rpath or runpath
- ld.so.conf ldconfig
ldconfig -p | grep foo
- -I for include, -L for library, -l for what you are looking for
chroot
put web server or build farm into a chroot. it is need to copy all files required insided rooted /jail/ directory, including bin, lib, modules and configuratation files. and also /etc/{ld.so.cache, ld.so.conf}
and /etc/ld.so.conf.d
to /jail/etc/
cp_support_shared_libs “/jail” “/usr/local/nginx/sbin/nginx” code example
ltrace
ltrace simply intercepts and records dynamic library calls and signals received. it could also intercept the system calls. it is the open write syscall that may have performace issue.
ltrace -S -tt ./foo
06:49:35.810781 SYS_brk(0) = 0x14db000
06:49:35.811163 SYS_access("/etc/ld.so.nohwcap", 00) = -2
06:49:35.811597 SYS_mmap(0, 8192, 3, 34) = 0x7fefa706b000
06:49:35.811829 SYS_access("/etc/ld.so.preload", 04) = -2
06:49:35.812194 SYS_open("/etc/ld.so.cache", 524288, 01) = 3
06:49:35.812494 SYS_fstat(3, 0x7fff16d37cc0) = 0
06:49:35.812604 SYS_mmap(0, 0x178e3, 1, 2) = 0x7fefa7053000
06:49:35.812778 SYS_close(3) = 0
06:49:35.812911 SYS_access("/etc/ld.so.nohwcap", 00) = -2
06:49:35.813185 SYS_open("/lib/x86_64-linux-gnu/libc.so.6", 524288, 024701570710) = 3
06:49:35.813620 SYS_read(3, "\177ELF\002\001\001", 832) = 832
06:49:35.813809 SYS_fstat(3, 0x7fff16d37d10) = 0
06:49:35.813924 SYS_mmap(0, 0x3c52c0, 5, 2050) = 0x7fefa6a85000
06:49:35.814122 SYS_mprotect(0x7fefa6c41000, 2093056, 0) = 0
06:49:35.814266 SYS_mmap(0x7fefa6e40000, 0x6000, 3, 2066) = 0x7fefa6e40000
06:49:35.814439 SYS_mmap(0x7fefa6e46000, 0x42c0, 3, 50) = 0x7fefa6e46000
06:49:35.814624 SYS_close(3) = 0
06:49:35.814755 SYS_mmap(0, 4096, 3, 34) = 0x7fefa7052000
06:49:35.814935 SYS_mmap(0, 8192, 3, 34) = 0x7fefa7050000
06:49:35.815115 SYS_arch_prctl(4098, 0x7fefa7050740, 0x7fefa7051050, 34) = 0
06:49:35.815367 SYS_mprotect(0x7fefa6e40000, 16384, 1) = 0
06:49:35.815626 SYS_mprotect(0x600000, 4096, 1) = 0
06:49:35.815782 SYS_mprotect(0x7fefa706d000, 4096, 1) = 0
06:49:35.815916 SYS_munmap(0x7fefa7053000, 96483) = 0
06:49:35.817359 __libc_start_main(0x40052d, 1, 0x7fff16d38698, 0x400550 <unfinished ...>
06:49:35.817764 puts("Hello world" <unfinished ...>
06:49:35.818278 SYS_fstat(1, 0x7fff16d38460) = 0
06:49:35.818538 SYS_mmap(0, 4096, 3, 34) = 0x7fefa706a000
06:49:35.818866 SYS_write(1, "Hello world\n", 12Hello world
) = 12
06:49:35.819178 <... puts resumed> ) = 12
06:49:35.819332 SYS_exit_group(0 <no return ...>
06:49:35.819501 +++ exited (status 0) +++
or ltrace -e malloc,free ./foo
ltrace -e fopen,fread,fwrite,fclose ./foo
Implications for installation and invocation
In brief, the implications include these alternatives:
- Put new .so’s in directories where the ld-linux already looks, such as /usr/lib (ie: InstallDir equals /usr). Or…
- Add configuration information for ld-config so that it looks in new directories that your installation has added. Or…
- Have users launch programs that need your .so’s using a script that augments the LD_LIBRARY_PATH environment variable to tell ld-config additional places to look. (pretty saft as it will not impact other process)
- The developers could include instructions into the executable (possible settable using configure during install) as to where to tell ld-linux to find libraries.
Note that if the new package includes .so’s just for its own use only, it will still need ld to load those .so’s, and thus need to use one of the methods to tell ld where to find them.
linker and sonames
Every shared library has a special name called the soname''. The soname has the prefix
lib’’, the name of the library, the phrase ``.so’’, followed by a period and a version number that is incremented whenever the interface changes
The key to managing shared libraries is the separation of these names. Programs, when they internally list the shared libraries they need, should only list the soname they need. Conversely, when you create a shared library, you only create the library with a specific filename (with more detailed version information). When you install a new version of a library, you install it in one of a few special directories and then run the program ldconfig. ldconfig examines the existing files and creates the sonames as symbolic links to the real names, as well as setting up the cache file /etc/ld.so.cache (described in a moment).
ldconfig doesn’t set up the linker names; typically this is done during library installation, and the linker name is simply created as a symbolic link to the ``latest’’ soname or the latest real name. I would recommend having the linker name be a symbolic link to the soname, since in most cases if you update the library you’d like to automatically use it when linking. I asked H. J. Lu why ldconfig doesn’t automatically set up the linker names. His explanation was basically that you might want to run code using the latest version of a library, but might instead want development to link against an old (possibly incompatible) library. Therefore, ldconfig makes no assumptions about what you want programs to link to, so installers must specifically modify symbolic links to update what the linker will use for a library.
Thus, /usr/lib/libreadline.so.3 is a fully-qualified soname, which ldconfig would set to be a symbolic link to some realname like /usr/lib/libreadline.so.3.0. There should also be a linker name, /usr/lib/libreadline.so which could be a symbolic link referring to /usr/lib/libreadline.so.3.
in debian’s convention, run-time(libfoo1: /usr/lib/libfoo.so.1-> libfoo.so.1.0.0) or dev (libfoo-dev: /usr/lib/libfoo.so->libfoo.so.1.0.0)
first of all there are several different name fo a shared library:
- the actual name of the file
- the name for build-time linker(ld)
- the name for dynamic linker(ld.so)
so we got symbolic links
when build-time linking, the ld take either /usr/lib/libfoo.so.0.0.1
or via -foo
, the path where the linker search for this libfoo is discussed above.
when build-time linker add this libfoo.so to the output, it need to name it. that’s SONAME. the linker will DT_SONAME and DT_NEEDED entry tag of the output file. but the loader only care about DT_NEEDED.
as a result, the DSO loader find don’t have to be the same one as build-time linker). the linker will load it anyway, but fail in a subtle ways.
to ensure the loader is able to find the library, libtool
works at build-time and install-time, to make both symbolic link used by loader and dynamic linker.
the second half is ldconfig, which recreate all symlinks to sonames in library search path
create shared library
gcc -shared -Wl,-soname,your_soname \
-o library_name file_list library_list
or
gcc -fPIC -g -c -Wall a.c
gcc -fPIC -g -c -Wall b.c
gcc -shared -Wl,-soname,libmystuff.so.1 \
-o libmystuff.so.1.0.1 a.o b.o -lc
install a shared library
First, you’ll need to create the shared libraries somewhere, like above.
Then, you’ll need to set up the necessary symbolic links, in particular a link from a soname to the real name (as well as from a versionless soname, that is, a soname that ends in .so for users who don’t specify a version at all). The simplest approach is to run:ldconfig -n directory_with_shared_libraries
Finally, when you compile your programs, you’ll need to tell the linker about any static and shared libraries that you’re using. Use the -l and -L options for this.
Dynamically Loaded (DL) Libraries
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
int main(int argc, char **argv) {
void *handle;
double (*cosine)(double);
char *error;
handle = dlopen ("/lib/libm.so.6", RTLD_LAZY);
if (!handle) {
fputs (dlerror(), stderr);
exit(1);
}
cosine = dlsym(handle, "cos");
if ((error = dlerror()) != NULL) {
fputs(error, stderr);
exit(1);
}
printf ("%f\n", (*cosine)(2.0));
dlclose(handle);
}