Thinking of Linking
Article from O'Reilly Network:
There are five essential, non-obvious conventions to master when using libraries. These aren't explained very clearly in most C books or manuals, probably because the language documenters consider linking part of the surrounding operating system, while the operating system people view linking as part of the language. As a result, no one makes much more than a passing reference to it unless someone from the linker team gets involved! Here are the essential UNIX linking facts of life:
1. Dynamic libraries are called libsomething.so, and static libraries are called libsomething.a
By convention, all dynamic libraries have a filename of the form libname.so (version numbers may be appended to the name). Thus, the library of thread routines is called libthread.so. A static archive has a filename of the form libname.a. Shared archives, with names of the form libname.sa, were a transient phenomenon, helping in the transition from static to dynamic libraries. Shared archives are now obsolete.
2. You tell the compiler to link with, for example, libthread.so by giving the option -lthread
The command-line argument to the C compiler doesn't mention the entire pathname to the library file. It doesn't even mention the full name of the file in the library directory! Instead, the compiler is told to link against a library with the command-line option -lname, where the library is called libname.so. In other words, the "lib" part and the file extension are dropped, and -l is jammed on the beginning instead.
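The naming rule can be sketched as a small POSIX shell function. The function name lib_to_flag is hypothetical and only illustrates the convention described above; it is not part of any compiler or linker.

```shell
#!/bin/sh
# lib_to_flag (hypothetical helper): turn a library filename into the
# -l option that names it, following the libname.so / libname.a convention.
lib_to_flag() {
    base=${1##*/}          # drop any leading directory
    name=${base#lib}       # drop the "lib" prefix
    name=${name%%.so*}     # drop ".so" and any version suffix after it
    name=${name%.a}        # drop ".a" for static archives
    printf '%s\n' "-l$name"
}

lib_to_flag /usr/lib/libthread.so   # prints -lthread
lib_to_flag libm.a                  # prints -lm
```

The function only manipulates strings, so it does not check that the library actually exists in any search directory.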
3. The compiler expects to find the libraries in certain directories
At this point, you may be wondering how the compiler knows in which directory to look for the libraries. Just as there are special rules for where to find header files, so the compiler looks in a few special places such as /usr/lib/ for libraries. For instance, the threads library is in /usr/lib/libthread.so.
The compiler option -Lpathname is used to tell the linker a list of other directories in which to search for libraries that have been specified with the -l option. There are a couple of environment variables, LD_LIBRARY_PATH and LD_RUN_PATH, that can also be used to provide this information. Using these environment variables is now officially frowned on, for reasons of security, performance, and build/execute independence. Use the -Lpathname and -Rpathname options at link time instead.
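As a rough illustration of pairing the two options, the sketch below builds the flags for one extra library directory. The helper name link_flags is made up for this example, and /usr/openwin/lib with the X11 library is used only as a sample directory; the script prints a command line rather than running the compiler.

```shell
#!/bin/sh
# link_flags (hypothetical helper): given a library directory and one or
# more library names, emit matching -L and -R options followed by the
# -l options for those libraries.
link_flags() {
    dir=$1
    shift
    flags="-L$dir -R$dir"
    for lib in "$@"; do
        flags="$flags -l$lib"
    done
    printf '%s\n' "$flags"
}

# Show the resulting command line without executing it.
echo cc main.c $(link_flags /usr/openwin/lib X11)
# prints: cc main.c -L/usr/openwin/lib -R/usr/openwin/lib -lX11
```

Giving the same directory to both options keeps the link-time search and the run-time search in agreement, which is the build/execute independence that the environment variables fail to provide.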
4. Identify your libraries by looking at the header files you have used
Another key question that may have occurred to you is, "How do I know which libraries I have to link with?" The answer, as (roughly speaking) enunciated by Obi-Wan Kenobi in Star Wars, is, "Use the source, Luke!" If you look at the source of your program, you'll notice routines that you call, but which you didn't implement. For example, if your program does trigonometry, you've probably called routines with names like sin() or cos(), and these are found in the math library. The manpages show the exact argument types each routine expects, and should mention the library it's in.
A good hint is to study the #includes that your program uses. Each header file that you include potentially represents a library against which you must link. This tip carries over into C++, too. A big problem of name inconsistency shows up here. Header files usually do not have a name that looks anything like the name of the corresponding library. Sorry! This is one of the things you "just have to know" to be a C wizard. Table 5-1 shows examples of some common ones.
Table 5-1. Library Conventions Under Solaris 2.x
#include Filename              Library Pathname              Compiler Option to Use
<math.h>                       /usr/lib/libm.so              -lm
<math.h>                       /usr/lib/libm.a               -dn -lm
<stdio.h>                      /usr/lib/libc.so              linked in automatically
"/usr/openwin/include/X11.h"   /usr/openwin/lib/libX11.so    -L/usr/openwin/lib -lX11
<thread.h>                     /usr/lib/libthread.so         -lthread
<curses.h>                     /usr/ccs/lib/libcurses.a      -lcurses
<sys/socket.h>                 /usr/lib/libsocket.so         -lsocket
Another inconsistency is that a single library may contain routines that satisfy the prototypes declared in multiple header files. For example, the functions declared in the header files <string.h>, <stdio.h>, and <time.h> are all usually supplied in the single library libc.so. If you're in doubt, use the nm utility to list the routines that a library contains. More about this in the next heuristic!
Handy Heuristic - How to Match a Symbol with its Library
If you're trying to link a program and get this kind of error:
ld: Undefined symbol
_xdr_reference
*** Error code 2
make: Fatal error: Command failed for target "prog"
Here's how you can locate the libraries with which you need to link. The basic plan is to use nm to look through the symbols in every library in /usr/lib, grepping for the symbols you're missing. The linker looks in /usr/ccs/lib and /usr/lib by default, and so should you. If this doesn't get results, extend your search to all other library directories (such as /usr/openwin/lib), too.
% cd /usr/lib
% foreach i (lib?*)
? echo $i
? nm $i | grep xdr_reference | grep -v UNDEF
? end
libc.so
libnsl.so
[2491] | 217028| 196|FUNC |GLOB |0 |8 |xdr_reference
libposix4.so
This runs "nm" on each library in the directory, to list the symbols known in the library. Pipe it through grep to limit it to the symbol you are searching for, and filter out symbols marked as "UNDEF" (referenced, but not defined in this library). The result shows you that xdr_reference is in libnsl. You need to add -lnsl on the end of the compiler command line.
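The loop above is written for the C shell. A rough Bourne-shell equivalent is sketched below as a function. The name find_symbol and the NM override variable are assumptions made for this example, and the "grep -v UNDEF" filter assumes the Solaris-style nm output shown above; other nm implementations mark undefined symbols differently.

```shell
#!/bin/sh
# find_symbol (hypothetical helper): print each library in a directory
# whose nm listing mentions the symbol on a line not marked UNDEF.
#   $1 = symbol to look for
#   $2 = directory to search (defaults to /usr/lib)
# NM may be set to point at a different nm program (an assumption of
# this sketch, not a standard variable).
find_symbol() {
    sym=$1
    dir=${2:-/usr/lib}
    for lib in "$dir"/lib?*; do
        [ -f "$lib" ] || continue
        # Keep lines naming the symbol, then drop undefined references.
        if "${NM:-nm}" "$lib" 2>/dev/null | grep "$sym" | grep -v UNDEF >/dev/null; then
            echo "$lib"
        fi
    done
}
```

A call such as find_symbol xdr_reference /usr/lib would then list only the candidate libraries instead of echoing every filename. Unlike the original loop, it does not print the matching nm line itself.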
5. Symbols from static libraries are extracted in a more restricted way than symbols from dynamic libraries
Finally, there's an additional and big difference in link semantics between dynamic linking and static linking that often confuses the unwary. Archives (static libraries) are acted upon differently than are shared objects (dynamic libraries). With dynamic libraries, all the library symbols go into the virtual address space of the output file, and all the symbols are available to all the other files in the link. In contrast, static linking only looks through the archive for the undefined symbols presently known to the loader at the time the archive is processed.
A simpler way of putting this is to say that the order of the statically linked libraries on the compiler command line is significant. The linker is fussy about where libraries are mentioned, and in what order, since symbols are resolved looking from left to right. This makes a difference if the same symbol is defined differently in two different libraries. If you're doing this deliberately, you probably know enough not to need to be reminded of the perils.
Another problem occurs if you mention the static libraries before your own code. There won't be any undefined symbols yet, so nothing will be extracted. Then, when your object file is processed by the linker, all its library references will be unfulfilled! Although the convention has been the same since UNIX started, many people find it unexpected; very few commands demand their arguments in a particular order, and those that do usually complain about it directly if you get it wrong. All novices have trouble with this aspect of linking until the concept is explained. Then they just have trouble with the concept itself.
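The left-to-right rule can be modeled with a toy simulation. This is a deliberately simplified sketch, not how any real linker is implemented: the function name simulate_link and the "kind:name:symbols" argument format are invented for illustration, object files only reference symbols, and archives only define them.

```shell
#!/bin/sh
# simulate_link (hypothetical toy model): walk the arguments left to
# right, tracking which symbols are still undefined.
#   obj:NAME:SYMS  an object file that references SYMS
#   lib:NAME:SYMS  a static archive that defines SYMS
# Prints the symbols left unresolved (an empty line means none).
simulate_link() {
    undefined=""
    for arg in "$@"; do
        kind=${arg%%:*}
        rest=${arg#*:}
        syms=${rest#*:}
        case $kind in
        obj)
            # An object file adds its references to the undefined set.
            for s in $syms; do
                undefined="$undefined $s"
            done
            ;;
        lib)
            # An archive is searched only for symbols undefined right now.
            remaining=""
            for u in $undefined; do
                case " $syms " in
                *" $u "*) ;;                       # satisfied by this archive
                *) remaining="$remaining $u" ;;
                esac
            done
            undefined=$remaining
            ;;
        esac
    done
    echo $undefined
}

simulate_link "lib:libm.a:sin cos" "obj:main.o:sin"   # prints: sin
simulate_link "obj:main.o:sin" "lib:libm.a:sin cos"   # prints an empty line
```

In the first call the archive is processed while nothing is undefined, so nothing is extracted and sin stays unresolved once main.o is seen. In the second call the archive comes after the reference and resolves it.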
The problem most frequently shows up when someone links with the math library. The math library is heavily used in many benchmarks and applications, so we want to squeeze the last nanosecond of runtime performance out of it. As a result, libm has often been a statically linked archive. So if you have a program that uses some math routines such as the sin() function, and you link statically like this:
cc -lm main.c
you will get an error message like this:
Undefined                       first referenced
 symbol                             in file
sin                                 main.o
ld: fatal: Symbol referencing errors. No output written to a.out
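Following the ordering rule described earlier, the usual remedy is to name the library after the code that references it, as in "cc main.c -lm". The sketch below shows one way a build script might enforce that order. The helper name libs_last is hypothetical, and it assumes no argument contains whitespace.

```shell
#!/bin/sh
# libs_last (hypothetical helper): reorder compiler arguments so that
# every -l option comes after all other arguments, keeping the relative
# order within each group. Assumes arguments contain no whitespace.
libs_last() {
    others=""
    libs=""
    for a in "$@"; do
        case $a in
        -l*) libs="$libs $a" ;;
        *)   others="$others $a" ;;
        esac
    done
    echo $others $libs
}

libs_last -lm main.c            # prints: main.c -lm
```

Keeping the -l options in their original relative order matters because, as noted above, the order among static libraries is itself significant.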
