The proof given in the wikipedia article uses some basic group action ideas; if you're not familiar with them, then I could try to cook up an algebraic proof. But let me just elaborate on the proof given in wiki:

Let G be the ath power of the cyclic group C_p. Let A be a set with p*a elements. Partition A into sets A_1,A_2,...,A_a, each with p elements. Think of G as acting on A, with each component C_p acting independently on one of the subsets cyclically.

Now consider a different space X: the set of all subsets of A with size p*b. Our group (C_p)^a has a natural action on this space induced by the previously described action: start with a B in X, and let phi in (C_p)^a send B to the set of all phi(b) for b in B. That is, act on each element of B individually, then collect the images together into a new set. It should be clear that the result still has p*b elements. Now we want to find the size of the orbits of this new action (the wiki article states the size without justification; it's not hard, but perhaps a bit confusing). So let B be in X. How many sets does it get sent to under various phi? Consider the intersection of B with each A_i. If the intersection is empty or all of A_i, then different phi don't look different in that component. If the intersection is nonempty, then there are p different images in that component (since the restriction of the action to each A_i is regular). So let k denote the number of i such that B intersect A_i is nonempty and not all of A_i (the article calls these incomplete rings), and we have that the size of the orbit of B is p^k. (We do this for each B.)

Okay, we're finally ready to get what we want. Basic counting techniques tell us that the size of X is C(pa, pb). Group action theory tells us that this is the sum of the sizes of the orbits of any action on X. So we have that C(pa, pb)=sum(p^k), where the sum is taken over all orbits of the action described above, and k is the number defined above for some (any) B in the orbit. Modulo p^2, the only surviving terms in the sum come from those with k=0 and those with k=1.

If the orbit of B has k=0, then each intersection of B with an A_i is either empty or all of A_i. There are of course C(a, b) of these (just choose b of the sets A_i to be full).

The orbit of B cannot have k=1, since |B| is divisible by p.

Hence the sum has C(a, b) terms which are 1, and the rest are zero modulo p^2, and we have Babbage's theorem.

Let me know if there's anything I can elaborate on here to help.

EDIT: Ah, gianlino seems to have an algebraic proof.

[gianlino, I can't figure out why toward the end you say that we only care about the 1 and the x^p from the term (1+x)^p...don't we need to at least concern ourselves also with the px and px^(p-1) terms?]

EDIT2: Oh, I see. I misread the phrase "all other exponents are multiples of p" as meaning any x^blah was p*x^blah. You're correct.