Casting pointers to references
Casting a pointer (like Foo *) to a reference (like Foo &) via reinterpret_cast or a C-style cast probably doesn't do what you want.
References ("refs") exist so that you can make libraries with user-defined constructs that "feel like" a built-in language abstraction. Refs are definitely confusing if you've transitioned from C to C++ — they're "pointerish" in the sense that the compiler ultimately boils them down to pointer values, but "not" in the sense that the language semantics restrict their use. [*]
I came across one such casting bug today, and wondered what the compiler actually emits for it.
As it turns out, GCC warns when you cast a pointer to its corresponding ref type:
test.cpp:12:23: warning: casting ‘int*’ to ‘int&’ does not dereference pointer
Unfortunately, if you cast it to a corresponding const ref type it stays silent. Consider this snippet of C++ code:
#include <stdio.h>

extern int SomeGlobal;

void DumpValue(const int &value)
{
    printf("%d\n", value);
}

int main() {
    int *pval = &SomeGlobal;
    DumpValue((const int &) pval);
    return 0;
}
Note that the correct approach is to use the deref operator (*) on pval to turn it into an int &, which is compatible with the const int & signature of DumpValue.
After a quick give-me-the-assembly command line sequence:
g++ -o test.o -c test.cpp
objdump -d -r test.o # Get assembly with inline linker relocation directives.
We can see the resulting x64 assembly:
0000000000000025 <main>:
25: 55 push %rbp
26: 48 89 e5 mov %rsp,%rbp
29: 48 83 ec 10 sub $0x10,%rsp
2d: 48 c7 45 f8 00 00 00 movq $0x0,-0x8(%rbp)
34: 00
31: R_X86_64_32S SomeGlobal
35: 48 8d 45 f8 lea -0x8(%rbp),%rax
39: 48 89 c7 mov %rax,%rdi
3c: e8 00 00 00 00 callq 41 <main+0x1c>
3d: R_X86_64_PC32 _Z9DumpValueRi-0x4
41: b8 00 00 00 00 mov $0x0,%eax
46: c9 leaveq
47: c3 retq
Walking through it step by step:
Instruction 2d places the address of SomeGlobal into the stack frame at -0x8(%rbp), which is the slot for the local variable pval. [†] The immediate operand is $0x0 for now, with a relocation (R_X86_64_32S) telling the linker to replace it with the address of SomeGlobal once the linking process figures out where SomeGlobal lives.
Instruction 35 computes the address of that stack slot with a lea instruction (which is like a fancy-pants add).
Instruction 39 then copies that address into %rdi, so between them instructions 35 and 39 make the address of the stack slot the first argument to DumpValue.
So the argument doesn't contain the address of SomeGlobal, which is what we were hoping to pass to DumpValue, but the address of the stack slot instead. [‡] The cast produced a reference to its operand, the pointer variable itself. That is the behavior you would expect if you took a value type and cast it to a ref, like so:
#include <stdio.h>

struct MyStruct {
    int foo, bar;
};

void DumpValues(const MyStruct &ms)
{
    printf("%d %d\n", ms.foo, ms.bar);
}

int main(void) {
    MyStruct ms = {42, 1024};
    DumpValues(reinterpret_cast<const MyStruct &>(ms));
    return 0;
}
Footnotes
[*] See ISO C++ (14882:2003) 8.3.2 #4.
[†] Recall that on x64, the stack grows "down" in memory space; i.e. as you push more function frames due to function invocation, the value in %rsp gets smaller. The base pointer %rbp is at the start of the frame, at the highest address, and the stack pointer %rsp is at the end of the frame, at the lowest address. The return address is at 8(%rbp), the previous frame's %rbp value is at 0(%rbp), and the first local stack slot for this function is -8(%rbp).
[‡] On an LP64 system like my x64 Linux machine, an int is 32 bits and a pointer is 64 bits, so we can see only half of the stack slot value through this reference (the low 32 bits, since x64 is little-endian).