This question follows on from another question that I asked earlier. In short, this is another of my attempts to combine two fully linked executables into a single fully linked executable. The difference is that the previous question was about merging an object file into a fully linked executable, which is even harder because it means that I need to handle relocations manually.
I have the following files:
example-target.c:

    #include <stdlib.h>
    #include <stdio.h>

    int main(void)
    {
        puts("1234");
        return EXIT_SUCCESS;
    }

example-embed.c:

    #include <stdlib.h>

My goal is to combine these two executables to produce a final executable that is the same as example-target but additionally has another main and func1.
From the point of view of the BFD library, each binary is composed (among other things) of a set of sections. One of the first problems I ran into was that these sections had conflicting load addresses (such that if I merged them, the sections would overlap).
What I did to solve this was to analyze example-target programmatically to get a list of the load address and size of each of its sections. I then did the same for example-embed and used this information to dynamically generate a linker command for example-embed.c that ensures all of its sections are linked at addresses that do not overlap any of the sections in example-target. As a result, example-embed is actually fully linked twice in this process: once to determine how many sections there are and what sizes they have, and once again to link with the guarantee that there are no section clashes with example-target.
The linker command generated on my system is:
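The placement logic described above can be sketched roughly as follows. This is a minimal illustration and not my actual tool: the names `sec_range`, `ranges_overlap`, `align_up` and `find_free_addr` are hypothetical, and it assumes each section is described only by its load address and size, with a power-of-two alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one section's load address and size in bytes. */
struct sec_range {
    uint64_t addr;
    uint64_t size;
};

/* True if the half-open ranges [addr, addr + size) of a and b intersect. */
static bool ranges_overlap(struct sec_range a, struct sec_range b)
{
    return a.addr < b.addr + b.size && b.addr < a.addr + a.size;
}

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Starting at `start`, find the first aligned address at which a section of
 * `size` bytes does not collide with any range in `used`. Whenever the
 * candidate collides, it is moved just past the colliding range and all
 * ranges are checked again.
 */
static uint64_t find_free_addr(const struct sec_range *used, size_t n_used,
                               uint64_t start, uint64_t size, uint64_t align)
{
    uint64_t addr = align_up(start, align);
    bool moved = true;

    while (moved) {
        moved = false;
        for (size_t i = 0; i < n_used; i++) {
            struct sec_range cand = { addr, size };
            if (ranges_overlap(cand, used[i])) {
                addr = align_up(used[i].addr + used[i].size, align);
                moved = true;
            }
        }
    }
    return addr;
}
```

In this sketch, `used` would hold the sections read from example-target, and `find_free_addr` would be called once per section of example-embed, adding each placed section to `used` before placing the next one.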
    -Wl,--section-start=.new.interp=0x1004238,--section-start=.new.note.ABI-tag=0x1004254,
    --section-start=.new.note.gnu.build-id=0x1004274,--section-start=.new.gnu.hash=0x1004298,
    --section-start=.new.dynsym=0x10042B8,--section-start=.new.dynstr=0x1004318,
    --section-start=.new.gnu.version=0x1004356,--section-start=.new.gnu.version_r=0x1004360,
    --section-start=.new.rela.dyn=0x1004380,--section-start=.new.rela.plt=0x1004398,
    --section-start=.new.init=0x10043C8,--section-start=.new.plt=0x10043E0,
    --section-start=.new.text=0x1004410,--section-start=.new.fini=0x10045E8,
    --section-start=.new.rodata=0x10045F8,--section-start=.new.eh_frame_hdr=0x1004604,
    --section-start=.new.eh_frame=0x1004638,--section-start=.new.ctors=0x1204E28,
    --section-start=.new.dtors=0x1204E38,--section-start=.new.jcr=0x1204E48,
    --section-start=.new.dynamic=0x1204E50,--section-start=.new.got=0x1204FE0,
    --section-start=.new.got.plt=0x1204FE8,--section-start=.new.data=0x1205010,
    --section-start=.new.bss=0x1205020,--section-start=.new.comment=0xC04000

(Note that I prefix the section names with .new using objcopy --prefix-sections=.new example-embedobj to avoid section name clashes.)
I then wrote some code to generate a new executable (I borrowed some code from both objcopy and the Security Warrior book). The new executable should have:
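For illustration, building such an option string from a list of section names and addresses might look like the sketch below. It is not my actual generator: `sec_addr` and `build_section_start_flags` are hypothetical names, and the only thing taken from the real tool chain is the `--section-start=NAME=ADDR` option of GNU ld, passed through the compiler driver with `-Wl,`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record: a (prefixed) section name and its chosen load address. */
struct sec_addr {
    const char *name;
    unsigned long addr;
};

/*
 * Write "-Wl,--section-start=NAME=0xADDR,--section-start=..." into buf.
 * Returns 0 on success, or -1 if there is nothing to emit, a name is empty,
 * or buf is too small to hold the whole string.
 */
static int build_section_start_flags(char *buf, size_t buflen,
                                     const struct sec_addr *secs, size_t n)
{
    size_t used = 0;

    assert(buf != NULL && secs != NULL);
    if (n == 0 || buflen == 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        int w;

        if (strlen(secs[i].name) == 0)
            return -1;

        /* The first option carries the "-Wl," prefix; the rest are comma-joined. */
        w = snprintf(buf + used, buflen - used, "%s--section-start=%s=0x%lX",
                     i == 0 ? "-Wl," : ",", secs[i].name, secs[i].addr);
        if (w < 0 || (size_t)w >= buflen - used)
            return -1;
        used += (size_t)w;
    }
    return 0;
}
```

The resulting string contains no spaces, so it can be passed to the compiler driver as a single argument.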
- All sections of example-target and all sections of example-embed
- A symbol table containing all symbols from example-target and all symbols from example-embed
The code I wrote is:
    #include <stdlib.h>
    #include <stdio.h>
    #include <stdbool.h>
    #include <bfd.h>
    #include <libiberty.h>

    struct COPYSECTION_DATA
    {
        bfd * obfd;
        asymbol ** syms;
        int symsize;
        int symcount;
    };

    void copy_section(bfd * ibfd, asection * section, PTR data)
    {
        struct COPYSECTION_DATA * csd = data;
        bfd * obfd = csd->obfd;
        asection * s;
        long size, count, sz_reloc;

        if((bfd_get_section_flags(ibfd, section) & SEC_GROUP) != 0)
        {
            return;
        }

        /* get output section from input section struct */
        s = section->output_section;
        /* get sizes for copy */
        size = bfd_get_section_size(section);
        sz_reloc = bfd_get_reloc_upper_bound(ibfd, section);

        if(!sz_reloc)
        {
            /* no relocations */
            bfd_set_reloc(obfd, s, NULL, 0);
        }
        else if(sz_reloc > 0)
        {
            arelent ** buf;

            /* build relocations */
            buf = xmalloc(sz_reloc);
            count = bfd_canonicalize_reloc(ibfd, section, buf, csd->syms);
            /* set relocations for the output section */
            bfd_set_reloc(obfd, s, count ? buf : NULL, count);
            free(buf);
        }

        /* get input section contents, set output section contents */
        if(section->flags & SEC_HAS_CONTENTS)
        {
            bfd_byte * memhunk = NULL;
            bfd_get_full_section_contents(ibfd, section, &memhunk);
            bfd_set_section_contents(obfd, s, memhunk, 0, size);
            free(memhunk);
        }
    }

    void define_section(bfd * ibfd, asection * section, PTR data)
    {
        bfd * obfd = data;
        asection * s = bfd_make_section_anyway_with_flags(obfd,
                section->name, bfd_get_section_flags(ibfd, section));

        /* set size to same as ibfd section */
        bfd_set_section_size(obfd, s, bfd_section_size(ibfd, section));

        /* set vma */
        bfd_set_section_vma(obfd, s, bfd_section_vma(ibfd, section));
        /* set load address */
        s->lma = section->lma;
        /* set alignment -- the power 2 will be raised to */
        bfd_set_section_alignment(obfd, s,
                bfd_section_alignment(ibfd, section));
        s->alignment_power = section->alignment_power;
        /* link the output section to the input section */
        section->output_section = s;
        section->output_offset = 0;
        /* copy merge entity size */
        s->entsize = section->entsize;
        /* copy private BFD data from ibfd section to obfd section */
        bfd_copy_private_section_data(ibfd, section, obfd, s);
    }

    void merge_symtable(bfd * ibfd, bfd * embedbfd, bfd * obfd,
            struct COPYSECTION_DATA * csd)
    {
        /* set obfd */
        csd->obfd = obfd;

        /* get required size for both symbol tables and allocate memory */
        csd->symsize = bfd_get_symtab_upper_bound(ibfd) /********+
                bfd_get_symtab_upper_bound(embedbfd) */;
        csd->syms = xmalloc(csd->symsize);

        csd->symcount = bfd_canonicalize_symtab (ibfd, csd->syms);
        /******** csd->symcount += bfd_canonicalize_symtab (embedbfd,
                csd->syms + csd->symcount); */

        /* copy merged symbol table to obfd */
        bfd_set_symtab(obfd, csd->syms, csd->symcount);
    }

    bool merge_object(bfd * ibfd, bfd * embedbfd, bfd * obfd)
    {
        struct COPYSECTION_DATA csd = {0};

        if(!ibfd || !embedbfd || !obfd)
        {
            return FALSE;
        }

        /* set output parameters to ibfd settings */
        bfd_set_format(obfd, bfd_get_format(ibfd));
        bfd_set_arch_mach(obfd, bfd_get_arch(ibfd), bfd_get_mach(ibfd));
        bfd_set_file_flags(obfd, bfd_get_file_flags(ibfd) &
                bfd_applicable_file_flags(obfd));

        /* set the entry point of obfd */
        bfd_set_start_address(obfd, bfd_get_start_address(ibfd));

        /* define sections for output file */
        bfd_map_over_sections(ibfd, define_section, obfd);
        /******** bfd_map_over_sections(embedbfd, define_section, obfd); */

        /* merge private data into obfd */
        bfd_merge_private_bfd_data(ibfd, obfd);
        /******** bfd_merge_private_bfd_data(embedbfd, obfd); */

        merge_symtable(ibfd, embedbfd, obfd, &csd);

        bfd_map_over_sections(ibfd, copy_section, &csd);
        /******** bfd_map_over_sections(embedbfd, copy_section, &csd); */

        free(csd.syms);
        return TRUE;
    }

    int main(int argc, char **argv)
    {
        bfd * ibfd;
        bfd * embedbfd;
        bfd * obfd;

        if(argc != 4)
        {
            perror("Usage: infile embedfile outfile\n");
            xexit(-1);
        }

        bfd_init();
        ibfd = bfd_openr(argv[1], NULL);
        embedbfd = bfd_openr(argv[2], NULL);

        if(ibfd == NULL || embedbfd == NULL)
        {
            perror("asdfasdf");
            xexit(-1);
        }

        if(!bfd_check_format(ibfd, bfd_object) ||
                !bfd_check_format(embedbfd, bfd_object))
        {
            perror("File format error");
            xexit(-1);
        }

        obfd = bfd_openw(argv[3], NULL);
        bfd_set_format(obfd, bfd_object);

        if(!(merge_object(ibfd, embedbfd, obfd)))
        {
            perror("Error merging input/obj");
            xexit(-1);
        }

        bfd_close(ibfd);
        bfd_close(embedbfd);
        bfd_close(obfd);
        return EXIT_SUCCESS;
    }

To summarize what this code does: it takes two input files (ibfd and embedbfd) and generates an output file (obfd). It:
- Copies the format / arch / mach / file flags and start address from ibfd to obfd.
- Defines the sections from both ibfd and embedbfd in obfd. Populating the sections happens separately, because BFD requires that all sections are created before any of them are populated.
- Merges the private data of both input BFDs into the output BFD. Because BFD is a common abstraction over many file formats, it cannot necessarily encapsulate everything that the underlying file format requires.
- Creates a combined symbol table consisting of the symbol tables of ibfd and embedbfd and sets it as the symbol table of obfd. This symbol table is kept so that it can later be used to build relocation information.
- Copies the sections from ibfd to obfd. As well as copying the section contents, this step also builds and sets the relocation table.
In the code above, some lines are commented out with /******** */. These lines deal with merging in example-embed. With them commented out, obfd is simply built as a copy of ibfd. I have tested this and it works fine. However, as soon as I uncomment these lines, the problems start.
With the uncommented version that does the full merge, an output file is still generated. Inspecting it with objdump shows all the sections, code and symbol tables of both inputs. However, objdump complains:
    BFD: BFD (GNU Binutils for Ubuntu) 2.21.53.20110810 assertion fail ../../bfd/elf.c:1708
    BFD: BFD (GNU Binutils for Ubuntu) 2.21.53.20110810 assertion fail ../../bfd/elf.c:1708

On my system, line 1708 of elf.c is:

    BFD_ASSERT (elf_dynsymtab (abfd) == 0);

elf_dynsymtab is a macro in elf-bfd.h for:

    #define elf_dynsymtab(bfd) (elf_tdata(bfd) -> dynsymtab_section)

I am not familiar with the ELF layer, but I believe this is a problem with reading the dynamic symbol table (or possibly with it being absent). For the time being, I am trying to avoid touching the ELF layer directly unless it is necessary. Can someone tell me what I am doing wrong, either in my code or conceptually?
If it is helpful, I can also post the code that generates the linker command, or compiled versions of the example binaries.
I realize that this is a very large question, and for that reason I would like to properly reward anyone who is able to help me with it. If I can solve this with someone's help, I will happily award a 500+ bounty.