Flaws found in Zed Shaw’s Learn C The Hard Way

Learn C The Hard Way is an “in-progress” book intended to teach you the C programming language. It is written by the all-knowing Zed Shaw. If you are reading this in the future, the mistakes listed here might have already been fixed by now. Let’s jump into unravelling his screw-ups:

Screw-up #1: Assigning const char * to char *

Look at the following segment taken from Exercise 15:

    char *names[] = {
        "Alan", "Frank",
        "Mary", "John", "Lisa"
    };

Can anyone tell me why this is wrong? the variable names is an array of pointers to char. If I inserted the following line after this declaration:

names[0][2] = 'X';

It will still compile, however the program will segfault. names should have been declared as const char *names[] since he’s using constant c-string literals as the initialization values. What a baby mistake. Throughout the entire book he keeps making the same mistake of assigning constant string literals (which are pointers to constant characters) to pointers to non-constant characters. I’m not going to mention every instance of this mistake because they are too numerous. Moving on.

Screw-up #2: Assuming that int is 32-bits

This gem is taken from chapter 21: “Advanced Data Types and Control Flow”:

int
Stores a regular integer, defaulting to 32 bits in size.

According to the ISO/IEC 9899:1999 specification, section 5.2.4.2.1, page 22, the minimum range of type int is -32768 through 32767. This means that integers are to have a minimum width of 16-bits. The actual size is implementation dependent. It is naive on Zed’s part to assume 32-bits for int.

Screw-up #3: Calling printf 5 times in a row

From chapter 25:

    printf("---- RESULTS ----\n");
    printf("First Name: %s", first_name);
    printf("Initial: '%c'\n", initial);
    printf("Last Name: %s", last_name);
    printf("Age: %d\n", age);

Maybe a smart enough compiler would optimize this, but this is just redundant. If it’s not optimized, then there will be extra machine instructions for entering into the printf function and initializing all the local variables five times each when this could’ve been saved by replacing it with one call to printf:

printf(
    "--- RESULTS ---\n"
    "First Name: %s"
    "Initial: \'%c\'\n"
    "Last Name: %s"
    "Age: %d\n",
    first_name, initial, last_name, age
);

You could say that it doesn’t matter and it’s more of a style preference. Well I say you look like a 12-year-old script kiddie if you’re doing multiple calls to printf in a row.

 

Screw-up #4: Zed Shaw’s entire data structure library

Let’s start off with his linked list data structure. Zed Shaw’s linked lists have too much unnecessary memory allocations. If you wanted to have a linked list of integers, his library would give you a pointer to a linked list of pointers to integers. Each pointer refers to a memory location allocated on the heap.

Here’s his List_create function:

List *List_create()
{
    return calloc(1, sizeof(List));
}

Oh great, so instead of declaring List as a local variable and having List_create initialize it, this function will allocate one using calloc and return a pointer to the new object. Yeah it looks better syntactically, but this is one heap allocation that I don’t need.

Get a load of the List_push prototype:

void List_push(List *list, void *value);

That’s right, the linked list doesn’t actually store values. It stores pointers to your values. If you try to use List_push on a non-global or non-dynamically allocated variable, the value will disappear when it loses scope and you might get a segfault if you try to access it later. If Zed Shaw knew C well enough to know about zero length arrays, his data structure implementation wouldn’t be as retarded as this.

He applies this same concept to his dynamic array data structure too. And I quote his description of a dynamic array:

A dynamic array is simply an array of void ** pointers that is pre-allocated in one shot and that point at the data.

That is not the definition of a dynamic array implementation, that’s the definition of a shitty dynamic array implemention. Just like his linked lists, Zed’s DArray data structure doesn’t store values. It stores pointers to values. This time, he includes a handy DArray_New function for allocating a new element so you can store it into DArray’s internal array. The DArray object itself is dynamically allocated. So if you want to have an array of 1 integer, you have to do the following: allocate one integer on the heap, allocate a DArray object on the heap, have the DArray object allocate an array on the heap for your integer. That’s 3 heap allocations for one integer, it’s starting to sound like Java now. A regular dynamic array would just allocate one array on the heap and store your integer’s value.

The reality is that you only need void * to be used as the base address of the array. And the entire array should fit into one contiguous block of memory allocated by malloc, realloc, or calloc. Storing values into the array means means copying values into the array’s memory space, not storing pointers to those values which might have temporary scope. The array can then expand using realloc.

Screw-up #5: Not using getaddrinfo properly

In excersice 45, he gives us a function for establishing a TCP connection:

int client_connect(char *host, char *port)
{
    int rc = 0;
    struct addrinfo *addr = NULL;

    rc = getaddrinfo(host, port, NULL, &addr);
    check(rc == 0, "Failed to lookup %s:%s", host, port);

    int sock = socket(AF_INET, SOCK_STREAM, 0);
    check(sock >= 0, "Cannot create a socket.");

    rc = connect(sock, addr->ai_addr, addr->ai_addrlen);
    check(rc == 0, "Connect failed.");

    rc = nonblock(sock);
    check(rc == 0, "Can't set nonblocking.");

    freeaddrinfo(addr);
    return sock;

error:
    freeaddrinfo(addr);
    return -1;
}

I’m not going to explain the every detail of getaddrinfo, you can find out by looking at the man pages. But just know that he specified the 3rd parameter as NULL. This means that the addrinfo structure(s) returned by getaddrinfo may not necessarily contain the address suitable for a stream-oriented, tcp socket. He should have defined an additional addrinfo structure for the hints parameter, containing: AF_UNSPEC for ai_family, SOCK_STREAM for ai_socktype, and IPPROTO_TCP for ai_protocol.