
Ruan-pysoft/cvcpp


Comparison between C and C++

A few people have asked me in the past few weeks what the difference between C and C++ is.

I thought a practical demonstration could come in handy, and as such I have put together a few examples.

Note that I am here using the C17 and C++11 standards, and so all of the fanciest new features won't be available (such as std::optional from C++17, std::println from C++23, or #elifdef from C23).

However, as far as I am aware, these standards have wide compiler support and are certainly not lacking any critical functionality.

Setup of the repository and prerequisites

Note that I have set this repository up to be used with Linux, and specifically the GNU Compiler Collection c and c++ compilers, as well as the make software for building.

On Ubuntu and most Debian-based distros, you should be able to install these programs with apt update && apt install build-essential, and likely also from your graphical package manager of choice.

On Arch Linux, you can install them with pacman -Syu gcc make.

In theory, this should also compile with other c and c++ compilers, such as the clang compilers or tcc. To change the compilers used, just change the CC and CXX variables in Makefile.

If you're on Windows, I'd suggest installing MSYS2. I believe that once you have that installed (making sure that you selected both the c and c++ compiler to be installed, as well as make) you should be able to build and run this repo through the MinGW shell.

(Though I last set up Windows for c/c++ development like two years ago, so I'm unsure of the exact process)

Building and running

Once you have make and the GNU Compiler Collection installed, you can build the examples by simply running make clean && make.

Once you've built the programs, you can run ls *_c *_cpp to see all the compiled programs, and the output should look something like the following:

$ ls *_c *_cpp
hashset_c hashset_cpp [...]

The files ending in _c are the c programs (compiled from the corresponding .c file using gcc), and the files ending in _cpp are the c++ programs (compiled from the corresponding .cpp file using g++).

If I wrote the programs correctly, corresponding programs with different suffixes should have the same behaviour.

A quick overview of the differences between C and C++

Note that I mainly have competitive programming experience with C++ (which encourages bad practices like using namespace std; and does not use external libraries or worry about things like future maintainability) and my experience with C is only for hobby projects, and my C is nearly entirely self-taught.

As such, I probably don't have the best practical experience as to the differences between the two, and it's likely the code I'll write will be unidiomatic and possibly run into undefined behaviour...

That being said, here is my understanding of the differences:

Low-level vs mid-level

You might have heard that both C and C++ are low-level languages, yet here I am talking about mid-level. What's going on?

Well, I do not really view C++ as a low-level language.

Certainly, C++ allows you to do low-level operations, and it can certainly be used as a low-level language, but it contains so many convenience features and abstractions that I can't really call it low-level.

Heck, I'm pretty sure in modern C++ you never have to touch manual memory management, thanks to abstractions like std::unique_ptr and std::shared_ptr and RAII! I mean, there's no garbage collector, sure, but still, you can just create memory, even memory accessed from different places with different lifetimes, and never worry about when to free it!
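As a rough sketch of what that looks like in C++11 (Tracked and peak_alive_in_scope are names I made up for illustration, they are not in the repository), here two heap objects are created and neither is ever explicitly deleted:

```cpp
#include <memory>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
  static int alive;
  Tracked() { ++alive; }
  ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of live objects at the busiest point inside the scope.
int peak_alive_in_scope() {
  int peak = 0;
  {
    // Sole owner: freed automatically when `one` goes out of scope.
    // (std::make_unique only arrived in C++14, hence the plain `new`.)
    std::unique_ptr<Tracked> one(new Tracked());
    // Shared owner: freed when the last copy of the pointer goes away.
    std::shared_ptr<Tracked> a = std::make_shared<Tracked>();
    std::shared_ptr<Tracked> b = a;  // same object, second owner
    peak = Tracked::alive;           // two objects: `*one` and `*a`
  }  // no delete anywhere: both objects are destroyed here
  return peak;
}
```

After peak_alive_in_scope() returns, Tracked::alive should be back at zero, since both smart pointers clean up after themselves.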

Furthermore, the standard library is filled with various high-level abstractions, such as iterators, dynamically-sized lists, hashmaps, deques, and more!

I mean, I can really go

std::set<int> set{};
set.insert(1);
set.insert(2);
set.insert(1);
set.insert(4);
std::vector<int> vec(set.begin(), set.end()); // C++11 needs the explicit <int>
std::sort(vec.rbegin(), vec.rend());
for (auto elem : vec) {
  std::cout << elem << ' ';
}
std::cout << std::endl;

and not worry about a single memory allocation, I don't have to worry about keeping a binary tree balanced or checking for duplicates, I don't need to worry about the mechanics of iterating through the set or the vector, and I can just sort based on arbitrary iterators over arbitrary types!

To do the equivalent in something like C with equivalent data structures and algorithms would be quite a task indeed, at least a few hours of work, assuming you're only using the standard library.

Furthermore, C's lack of abstractions leads to quite a nice property: You can almost translate a C program into assembly line-by-line, with only a few instructions per C operation.

You see a +? Well, that's an add instruction.

You see a for (;;) { ... } loop? That's a simple jmp at the end of the code block.

What about a while (...) { ... }? Super complex? Nope, just check the condition, jump past the end if it's false (jne, je, or something similar), and at the end jump back to the condition check (jmp).

What about a function call, super complex, surely? Nope, just add the arguments to the stack or store them in registers as specified by the C ABI, and then call the function with the call instruction.

Returning? Restore the stack and registers to how they were before the function, move the result either onto the stack or into a register (as specified by the C ABI), and execute the ret instruction.

Now, of course, the compiler doesn't quite do this. It could, yes, but modern compilers tend to apply all sorts of fancy optimisations. But the principle remains.
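To make the while-loop translation described above a bit more concrete, here is a small sketch with the same loop written twice: once normally, and once with explicit labels and goto standing in for the jumps. The function names are just for illustration, and a real compiler is free to emit something quite different.

```cpp
// Ordinary structured loop: sums 1 + 2 + ... + n.
int sum_while(int n) {
  int total = 0;
  int i = 1;
  while (i <= n) {
    total += i;
    ++i;
  }
  return total;
}

// The same loop written with explicit jumps, mirroring the assembly shape.
int sum_goto(int n) {
  int total = 0;
  int i = 1;
check:
  if (!(i <= n)) goto done;  // conditional jump past the end of the loop
  total += i;
  ++i;
  goto check;                // unconditional jump back to the check (jmp)
done:
  return total;
}
```

Both functions should return the same value for any n, for example 55 for n = 10.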

Meanwhile, in C++, you see that simple +? It could be a simple add, or it could also be a function call to an overloaded operator+, who knows?
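A minimal sketch of that ambiguity, using a made-up Vec2 type and a counter (plus_calls) that exists only to show when the overloaded operator actually runs:

```cpp
struct Vec2 {
  int x;
  int y;
};

// How many times the overloaded operator has run (for demonstration only).
int plus_calls = 0;

Vec2 operator+(const Vec2 &lhs, const Vec2 &rhs) {
  ++plus_calls;
  Vec2 out = {lhs.x + rhs.x, lhs.y + rhs.y};
  return out;
}

int add_ints(int a, int b) { return a + b; }     // plain integer add
Vec2 add_vecs(Vec2 a, Vec2 b) { return a + b; }  // a function call in disguise
```

The two + signs look identical in the source, but only the second one bumps plus_calls.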

Okay, but surely returning from a function is still simple, right? Nope! Now you must also call the destructors for any stack-allocated variables you have. Implicit function calls, isn't that fun?
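Here is a small sketch of those implicit calls. Noisy, events, and work are hypothetical names; the events vector is only there so the order of the destructor calls can be inspected afterwards.

```cpp
#include <string>
#include <vector>

// Records events so the order of the implicit calls is visible.
std::vector<std::string> events;

struct Noisy {
  std::string name;
  explicit Noisy(const std::string &n) : name(n) {}
  ~Noisy() { events.push_back("destroy " + name); }
};

int work() {
  Noisy first("first");
  Noisy second("second");
  events.push_back("return");
  return 42;  // destructors run here, in reverse order of construction
}
```

After one call to work(), events holds "return", "destroy second", "destroy first", in that order, even though the function body never mentions a destructor.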

Okay, okay, but at least the for (auto elem : container) {...} loop is just a straight-forward abstraction, right? Oh, of course. It's basically just syntax sugar for for (auto iter = container.begin(); iter != container.end(); ++iter) { auto elem = *iter; ... }. But consider this: that innocent-looking loop calls begin() and end(), plus the iterator's !=, ++, and * operators, and for the standard containers all of those come from the standard library.

No for loop in C calls library code behind your back.
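The desugaring can be sketched side by side. This is only an approximation of what the language specifies (the real rule also binds the container to a hidden reference first), and the function names are mine:

```cpp
#include <vector>

// The convenient form.
int sum_range_for(const std::vector<int> &values) {
  int total = 0;
  for (auto elem : values) {
    total += elem;
  }
  return total;
}

// Roughly what the loop above expands to: begin(), end(), !=, ++ and *
// are all calls into the container's iterator type.
int sum_explicit(const std::vector<int> &values) {
  int total = 0;
  for (auto iter = values.begin(), end = values.end(); iter != end; ++iter) {
    auto elem = *iter;
    total += elem;
  }
  return total;
}
```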

Now, none of this makes one better than the other. It's just a difference. I mean, I prefer C for hobby projects, but I'm sure that C++ is better for most big projects, assuming you're not doing something super low-level like Arduino programming or writing an OS kernel.

That being said, C++ isn't quite high-level. It exposes all sorts of low level abilities like manual memory management and bit casting, and all of that quite conveniently.

If you compare it to something like Python or Java or Common Lisp or Haskell, you'll see that it certainly isn't a high-level language.

Language complexity

One nice thing about C++ is all the convenient features it has like namespaces and templating and object-oriented functionality.

However, this comes at a cost.

The C++ language is a complicated beast, with all sorts of odd bits of syntax and special rules (if you provide a move constructor for a class, which other constructors should you provide and what are their signatures? Because I sure don't know, I google that stuff every time).

Compared to this, C is a refreshingly straight-forward language with minimal syntax. For example, you don't need to worry about the language requirements for constructors or destructors, nor do you need to know which functions should be noexcept! This is because C doesn't have constructors, destructors, or exceptions. Simple!

Another consequence of this difference in complexity is that C++ programs can take noticeably longer to compile. Even just in these simple examples I've noticed that the C++ programs take longer to compile than the C programs.

Documentation

Another difference, and perhaps this is just laziness or ignorance on my part, is documentation.

If I want the documentation for a C++ STL function or class, I need an internet connection and then I'll visit cppreference.com.

Now, this really isn't the end of the world, but it requires an internet connection and a browser and I need to use my mouse and it just isn't a particularly pleasant experience for me, even if it's not exactly unpleasant either.

For C, on the other hand, I have the manpages for the C stdlib downloaded on my laptop, so if I want to see the documentation for printf it's just a simple man printf.3 in my terminal, or :Man printf.3 in Neovim, my preferred text editor.

This is quite convenient for me, as it does not require an internet connection, it does not require me to have a browser open, it does not require me to use my mouse, and it doesn't even require me to leave my terminal or even my text editor!

(And I'm sure a good IDE could help with this, but I really dislike using IDEs, especially those absurdly memory-hungry Electron-based beasts which I just despise, so no)

The examples

hello

hello.c and hello.cpp are the standard hello world programs, and are nothing too exotic.

You'll notice both how similar the two programs are, as well as one or two differences that stand out:

  • In C++, we include the <iostream> header for input/output functionality, whereas in C we include the <stdio.h> header.
  • In C++ we use stream output with std::cout and the left-shift (<<) operator, whereas in C we use a simple function puts.

Now, while they are similar in structure, the C++ hello world program actually makes use of multiple C++-specific features not available in C!

namespaces

You'll note that pesky std:: sitting at the start of cout and endl; that is a namespace.

In C++ different functions can sit in various different namespaces. This is very useful for preventing name collision, as the same name can be used in different namespaces for different things. For example:

#include <iostream>

namespace foo {
  void noise() {
    std::cout << "bark!" << std::endl;
  }
}

namespace bar {
  void noise() {
    std::cout << "meow..." << std::endl;
  }
}

int main() {
  foo::noise(); // outputs "bark!"
  bar::noise(); // outputs "meow..."
}

This means that if you use a library in your project, and it uses a function name you're already using, it doesn't matter! (Assuming, of course, that the library is using a namespace as it should)

This, however, is not the case in C.

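For example, the following C program compiles (although GCC may warn that it conflicts with the built-in pow, and strictly speaking the C standard reserves the names of its library functions):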

#include <stdio.h>

void pow(void) {
  puts("POW!");
}

int main() {
  pow();
}

Whereas if you do the following:

#include <stdio.h>
#include <math.h>

// oh no! pow is already defined in math.h!
void pow(void) {
  puts("POW!");
}

int main() {
  pow();
}

You'll get a compiler error about conflicting types for pow, since it is already declared by the math.h header with a different signature.

Operator overloading

In addition to namespaces, the hello world program demonstrates another C++ feature that C lacks: operator overloading.

In C, << is a left bitshift operator and it only works with integer types.

In C++, it is also a left bitshift operator, but it can be overloaded for different types. Assuming hello.cpp prints a string literal, we're using the overload std::ostream& operator<<(std::ostream&, const char*) (there is a similar one taking a const std::string&), which writes a string to the output stream and then returns a reference to said output stream.

Quite an ugly bit of syntax, in my opinion!

Meanwhile, in C, this operator overloading isn't available, and so we just call the lowly puts function, which prints a string to stdout followed by a newline.
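To show how such an overload is written for your own type, here is a minimal sketch. Point and describe are hypothetical names, and I format into a std::ostringstream rather than std::cout so the result is easy to check:

```cpp
#include <ostream>
#include <sstream>
#include <string>

struct Point {
  int x;
  int y;
};

// Teach output streams how to print a Point.
std::ostream &operator<<(std::ostream &out, const Point &p) {
  return out << '(' << p.x << ", " << p.y << ')';
}

// Helper that formats into a string instead of writing to std::cout.
std::string describe(const Point &p) {
  std::ostringstream out;
  out << "point " << p;  // chaining works because << returns the stream
  return out.str();
}
```

The same operator works with std::cout, since std::cout is also a std::ostream.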

set

set.c and set.cpp are a simple program using sets: We take numbers as input from a user until we hit EOF, and add them to a set, a data structure that contains at most one of any given value. Lastly, we output them in ascending order.

In C++, this is quite simple: We just use the builtin std::set type, the builtin insert method, and then C++'s for each loop to print the numbers in order.

Furthermore, the C++ set is quite an efficient data structure: it is typically implemented as a balanced binary search tree (usually a red-black tree), resulting in a space complexity of O(n) along with O(log n) time complexity for search, insertion, and deletion.
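The core of the C++ approach can be sketched in a few lines. This is not the actual set.cpp, just my guess at its general shape; unique_sorted and unique_sorted_from are made-up names, and reading from a std::istream instead of std::cin directly is only there to make it easy to try on a string.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Reads integers until EOF (or the first non-number) and returns each
// distinct value once, in ascending order.
std::vector<int> unique_sorted(std::istream &in) {
  std::set<int> seen;
  int value;
  while (in >> value) {
    seen.insert(value);  // duplicates are silently ignored
  }
  // Iterating a std::set always visits the elements in ascending order.
  return std::vector<int>(seen.begin(), seen.end());
}

// Convenience wrapper for trying it on a string.
std::vector<int> unique_sorted_from(const std::string &text) {
  std::istringstream in(text);
  return unique_sorted(in);
}
```

Calling unique_sorted(std::cin) would give the interactive behaviour described above.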

In C, we have no such data structure supplied by the standard library, and so we must roll our own.

Here I simply use a dynamically-sized array, and before appending a number I first search the array to verify it hasn't been added before. Then, before displaying, I sort it using bubble sort.

Simple enough, but comes at the cost of efficiency: While the space complexity is still only O(n), search, insertion, and deletion are all now O(n)!
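A minimal sketch of that approach follows. The names (intset, intset_add, and so on) are hypothetical and set.c may well be organised differently. It is written so that it should compile as either C or C++, which is why it sticks to the C headers, int return values, and an explicit cast on realloc.

```cpp
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; set.c may be organised differently. */
struct intset {
  int *items;
  size_t len;
  size_t cap;
};

/* Linear search, O(n): returns 1 if value is present, 0 otherwise. */
int intset_contains(const struct intset *set, int value) {
  size_t i;
  for (i = 0; i < set->len; ++i) {
    if (set->items[i] == value) return 1;
  }
  return 0;
}

/* Returns 0 if allocation fails, 1 otherwise; a duplicate is a no-op. */
int intset_add(struct intset *set, int value) {
  if (intset_contains(set, value)) return 1;
  if (set->len == set->cap) {
    size_t new_cap = set->cap == 0 ? 8 : set->cap * 2;
    int *grown = (int *)realloc(set->items, new_cap * sizeof(int));
    if (grown == NULL) return 0;  /* the old array is still valid */
    set->items = grown;
    set->cap = new_cap;
  }
  set->items[set->len++] = value;
  return 1;
}

void intset_free(struct intset *set) {
  free(set->items);
  set->items = NULL;
  set->len = set->cap = 0;
}
```

A set starts out as struct intset s = {NULL, 0, 0}; and should be passed to intset_free once you are done with it.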

Furthermore, while bubble sort is quite a simple algorithm, it unfortunately has quite a bad time complexity of O(n²). I could also have done something like quicksort or merge sort, which would have a better time complexity of O(n log n) (on average, in quicksort's case). In fact, I'll probably update it at some point, I just couldn't remember how to do quicksort at the time, and I didn't want the extra memory allocations required by merge sort...
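One alternative for the sorting step, which as far as I can tell from the description above the repository does not use, is the qsort function from the C standard library. The sketch below uses made-up names (compare_ints, sort_ints) and should compile as either C or C++. Note that the C standard does not promise any particular time complexity for qsort, although common implementations are much faster than bubble sort on large inputs.

```cpp
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort: negative, zero, or positive. */
int compare_ints(const void *a, const void *b) {
  int lhs = *(const int *)a;
  int rhs = *(const int *)b;
  return (lhs > rhs) - (lhs < rhs);  /* avoids the overflow of lhs - rhs */
}

/* Sorts items[0..len) into ascending order, in place. */
void sort_ints(int *items, size_t len) {
  qsort(items, len, sizeof items[0], compare_ints);
}
```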

avl_tree

avl_tree.c and avl_tree.cpp are a continuation of set.*: It is the same basic program, but now using an AVL tree to store the set, giving us our juicy O(log n) search, insertion, and deletion.

(Do note, the implementation is basically entirely out of my head plus a few sketches on a piece of paper to figure the rotations out, after I got the hint from Wikipedia that the balance factor should stay between -1 and 1; as such my implementation might not match Wikipedia's description completely, and could even not be a proper AVL tree... However, I'm relatively confident that it is a self-balancing binary search tree)
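For readers who have not met rotations before, here is a minimal sketch of a single right rotation with height bookkeeping. It is not taken from avl_tree.c: the node layout and every function name are assumptions of mine, and a full AVL tree also needs the mirrored left rotation plus the logic that decides when to rotate. It should compile as either C or C++.

```cpp
#include <stddef.h>

/* Hypothetical node layout; avl_tree.c may store different fields. */
struct node {
  int value;
  int height;  /* height of the subtree rooted here; a leaf has height 1 */
  struct node *left;
  struct node *right;
};

int height_of(const struct node *n) { return n == NULL ? 0 : n->height; }

void update_height(struct node *n) {
  int l = height_of(n->left);
  int r = height_of(n->right);
  n->height = 1 + (l > r ? l : r);
}

/* Balance factor: left height minus right height. */
int balance_of(const struct node *n) {
  return height_of(n->left) - height_of(n->right);
}

/* Right rotation: the left child becomes the new root of this subtree.
   The caller must make sure root->left is not NULL. */
struct node *rotate_right(struct node *root) {
  struct node *pivot = root->left;
  root->left = pivot->right;
  pivot->right = root;
  update_height(root);   /* root now sits below pivot, so fix it first */
  update_height(pivot);
  return pivot;
}
```

Rotating a left-leaning chain 3, 2, 1 (balance factor 2 at the top) this way leaves 2 at the root with 1 and 3 as its children and a balance factor of 0.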

The C++ program is actually almost identical to the C program, as a demonstration of the fact that C++ is (almost) a superset of C. I only had to make two changes:

  • I had to cast the result of malloc to (struct node *) instead of just assigning it directly (although I could also have compiled with -fpermissive and I'd only get a warning for leaving out the cast)
  • And I had to rename all the this parameters to self, as this is a keyword in C++ (where it is the pointer to the object implicitly passed as the first parameter to methods)
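Both adjustments can be seen in a tiny sketch. The counter type and its functions are hypothetical and have nothing to do with the AVL tree itself; the point is only that this version should be accepted by both gcc and g++.

```cpp
#include <stdlib.h>

struct counter {
  int count;
};

/* In C the cast is optional; in C++ the void * from malloc must be cast. */
struct counter *counter_new(void) {
  struct counter *self = (struct counter *)malloc(sizeof *self);
  if (self != NULL) self->count = 0;
  return self;
}

/* The parameter is called self, since this is a keyword in C++. */
int counter_bump(struct counter *self) { return ++self->count; }

void counter_free(struct counter *self) { free(self); }
```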

And here you see the power of C++'s standard library: a simple std::set<int> in C++ takes about two hundred lines of C, not to mention the hour or two it took to implement and debug the structure.

Also, if you want to have some fun, try compiling avl_tree.c with gcc -DDEBUG avl_tree.c and running ./a.out: After each insertion, the program will now print the structure of the tree, first displaying the values at each node, then displaying the height of each node, and lastly displaying the balance factor (left height minus right height) of each node, which should always be between -1 and 1.

(What do you mean that's not most people's idea of fun? You calling me weird? Okay, fair enough)
