ungetc is only guaranteed to take one byte of pushback. On the other hand, I've tested it on Windows and Linux and it seems to work with two bytes.
Are there any platforms (e.g. any current Unix systems) on which it actually only takes one byte?
The C99 standard (and the C89 standard before that) said unequivocally:
One character of pushback is guaranteed. If the ungetc function is called too many times on the same stream without an intervening read or file positioning operation on that stream, the operation may fail.
So, to be portable, you do not assume more than one character of pushback.
Having said that, on both MacOS X 10.7.2 (Lion) and RHEL 5 (Linux, x86/64), I tried:
#include <stdio.h>

int main(void)
{
    int i;

    /* Push back up to 4096 characters without reading anything first. */
    for (i = 0; i < 4096; i++)
    {
        int c = i % 16 + 64;
        if (ungetc(c, stdin) != c)
        {
            fprintf(stderr, "Error at count = %d\n", i);
            return(1);
        }
    }
    printf("No error up to count = %d\n", i-1);
    return(0);
}
I got no error on either platform. By contrast, on Solaris 10 (SPARC), I got an error at 'count = 4' — and the same on Solaris 11.3. Worse, on HP-UX 11.00 (PA-RISC), HP-UX 11.23 (Itanium), and HP-UX 11.31 (Itanium), I got an error at 'count = 1' - belying the theory that 2 is safe. Similarly, AIX 6.0 (and 7.2) gave an error at 'count = 1'.
So, AIX and HP-UX only allow one character of pushback on an input file that has not had any data read on it. This is a nasty case; they might provide much more pushback capacity once some data has been read from the file (but a simple test on AIX adding a getchar() before the loop didn't change the pushback capacity).
In December 2023, the program above failed at count = 1 on Windows Server 2016 Standard using MSVC 19.15.26730 for x64. This is different from what rwallace found.
[…] fseek, if your file is seekable) if the second one returns failure.
[…] 1u<<30 reports no error on macOS 15.3.2.
There are some posts here suggesting that it makes sense to support 2 chars for the sake of scanf.
I don't think this is right: scanf only needs one, and this is indeed the reason for the limit. The original implementation (back in the mid 70s) supported 100, and the manual had a note: in the future we may decide to support only 1, since that's all that scanf needs. See page 3 of the original manual (Maybe not original, but pretty old.)
To see more vividly that scanf needs only 1 char, consider this code for the %u feature of scanf.
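/* needs <stdio.h> and <ctype.h>; reads from stdin */
int c;
while (isspace(c = getc(stdin))) {}   // skip white space
unsigned num = 0;
while (isdigit(c)) {
    num = num*10 + (c - '0');
    c = getc(stdin);
}
ungetc(c, stdin);                     // the one character read too far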
Only a single call to ungetc() is needed here. There is no reason why scanf needs a char all to itself: it can share with the user.
scanf() only requires one character of pushback. See C11 §7.21.6.2, The fscanf() function, footnote 285: fscanf pushes back at most one input character onto the input stream. Therefore, some sequences that are acceptable to strtod, strtol, etc., are unacceptable to fscanf.
Implementations which support 2 characters of pushback probably do so in order that scanf can use ungetc for its pushback rather than requiring a second nearly-identical mechanism. What this means for you as the application programmer is that even if calling ungetc twice seems to work, it might not be reliable in all situations: for example, if the last operation on the stream was fscanf and it had to use pushback, you can probably only ungetc one character.
In any case, it's nonportable to rely on having more than one character of ungetc pushback, so I would highly advise against writing code that needs it...
scanf can only push back one character but after that you're still allowed to call ungetc, producing two characters of pushback total.
[…] fscanf function" (p154): An implementation must not use the ungetc function to perform the necessary one-character pushback. In particular, since the unmatched text is left “unread,” the file position indicator as reported by the ftell function must be the position of the character remaining to be read. […continued…]
[…] ungetc, the pushback in fscanf could not affect the pushback stack in ungetc. A scanf call that matches N characters from a stream must leave the stream in the same state as if N consecutive getc calls had been made.
None of the existing answers consider that the number of characters that can be pushed back may depend on context, rather than be a fixed amount per implementation, because it might simply be pushed into the stream buffer.
A simple way to implement pushback is to re-add the character to the stream buffer. There is always room for at least one character: In the preceding getchar or equivalent operation (including, say, a scanf that called getchar or did some operation to get a character), a character was removed from the buffer. All that is necessary to “put back” a character is to write it into the buffer and adjust the buffer accounting fields (saying how many characters are in the buffer and/or where the end of the pending data is). In the worst case, there were no characters in the buffer when the previous getchar was called, so it had to perform a read from the file to fill the buffer, and then it returned a character, leaving one empty space in the buffer. One ungetc will work, and the next will fail because the buffer is full.
In the best case, the last getchar emptied the buffer, and many ungetc calls can be made successfully, until the buffer is full again. So the number of characters of pushback depends on where one happens to be in the stream buffer.
Note that an operation such as fseek would discard the buffer contents (mark the buffer as empty) and change the file location data but not perform a new read, so the buffer would be empty. If a getchar or other read function were called, the stream would perform a file read to fill the buffer. If an ungetc were called, the buffer is ready to accept characters.
[…] scanf can simply use ungetc for its pushback rather than requiring a separate mechanism.
[…] scanf to do a great job in all cases. In fact, even 2 is not enough. For reading integers, 1 is plenty. But suppose you want to read floating point numbers like 1.5e-9. Now consider what happens when you get an input "number" like this: 1.5e-q. Eventually scanf will read the q and think to itself "I thought this was a float in scientific notation, but it's not; I should stop here". It will un-get the q and "return" 1.5 to the caller. But the e- is gone forever, and ideally it should not be, I think.