## String to floating-point implementation

While working on an implementation of a scanf-like method for an input stream class, I realized that I also needed a string to floating-point function that this method could use to convert the floating-point input it reads. As with my stoi template function, I wrote this stof template function, which can convert a string to a float, double, or long double. While this function does detect overflows and underflows, it doesn’t do so perfectly and may sometimes miss some of those errors, in which case the result is INF or NaN.

    #include <cctype>
    #include <cstring>
    #include <cstdio>
    #include <cstdlib>
    #include <stdexcept>
    #include <cfloat>
    #include <iostream>

    using namespace std;

    /* float, double, or long double */
    template <typename type>
    type stof(const char* s) {
        int i;
        type
            ret = 0,      //Final return value
            num = 0,      //accumulator
            place = 1;    //digit place 10's value
        bool neg = false; //Negative or positive number
        const char *t;
        bool pineapple = false; //Tells if we are calculating the whole or fractional part

        while (isspace(*s)) s++;            //Skip whitespace
        if (*s == '-') { neg = true; s++; } //Check for negative
        else if (*s == '+') s++;

        if ((t = strchr(s, '.')) == NULL)   //Find first digit of the whole number part
            if ((t = strchr(s, 'e')) == NULL)
                t = s + strlen(s);

        for (t--; t >= s; t--, place *= 10) { //Calculate the whole number part
            goto stof_smoothie;
            stof_forloop_1: ;
        }

        //Skip if no decimal point
        if ((t = strchr(s, '.')) == NULL) goto stof_blend_ice;

        //Calculate the fractional part
        pineapple = true;
        //Loop through each number after decimal point.
        //I didn't use isdigit in the for loop compare because
        //later on i have to check if it's a valid digit
        //and poop out an error if it's not.
        for (t++, place = 0.1; *t != 'e' && *t != 0; t++, place /= 10) {
            goto stof_smoothie;
            stof_forloop_2: ;
        }

    stof_blend_ice:
        if ((t = strchr(s, 'e')) == NULL) return ret;
        //Multiply by 10^x if suffixed with 'e'
        else { //Where x comes right after e
            i = atoi(t + 1);
            //Divide by 10s if negative
            if (i < 0) {
                for (; i < 0; i++) {
                    num = ret / 10;
                    //Dividing should make a smaller number, if not...there was an underflow
                    if (num >= ret) throw underflow_error("underflow");
                    else ret = num;
                }
            } else { //Multiply by 10s if positive
                for (; i > 0; i--) {
                    num = ret * 10;
                    //Multiplying should make a bigger number, if not...there was an overflow
                    if (num <= ret) throw overflow_error("overflow 0");
                    else ret = num;
                }
            }
        }
        return ret;

        //Performed in both loops where the whole and fractional
        //parts are calculated. I know goto is considered bad
        //but this time it's just genius. I save more space this
        //way
    stof_smoothie:
        //Not sure if this applies to floating numbers
        //but place value would loop back to 0 if there
        //was an over/underflow
        if (!place)
            if (neg) throw underflow_error("underflow");
            else throw overflow_error("overflow 1");

        //Validate and convert digit
        if (!isdigit(*t)) throw invalid_argument("Invalid string");
        num = (*t - '0') * place;

        //Basically, it's an overflow or underflow if the result is not logical
        if (neg) {
            if ((type)(ret - (num ? num : place) > ret)) throw underflow_error("underflow");
            ret -= num;
            if (ret >= 0 && num != 0) throw underflow_error("underflow");
        } else {
            if ((type)(ret + (num ? num : place) <= ret)) throw overflow_error("overflow 2");
            ret += num;
            if (ret <= 0 && num != 0) throw overflow_error("overflow 3");
        }

        if (pineapple) goto stof_forloop_2;
        else goto stof_forloop_1;
    }

    int main() {
        cout << stof<float>("3.14e-8") << endl;
    }
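Since a missed overflow shows up as INF rather than as a wrapped-around value, one alternative to the comparison-based checks is to scale first and inspect the result afterwards. The following is only a sketch under that assumption; `scale_by_pow10_checked` is a hypothetical helper and not part of the stof function above.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical helper (not part of the original stof): scales value by
// 10^exp10 and reports range errors by inspecting the finished result.
template <typename T>
T scale_by_pow10_checked(T value, int exp10) {
    T result = value;
    for (int i = exp10; i > 0; --i) result *= 10;  // positive exponent
    for (int i = exp10; i < 0; ++i) result /= 10;  // negative exponent

    // Floating-point overflow saturates to infinity instead of wrapping.
    if (!std::isfinite(result))
        throw std::overflow_error("overflow");
    // A nonzero input that collapses to zero has underflowed completely.
    if (value != 0 && result == 0)
        throw std::underflow_error("underflow");
    return result;
}
```

Called as `scale_by_pow10_checked<double>(1.0, 400)`, this should throw an overflow error, and the same call with `-400` should throw an underflow error. It works the same way for negative values, but it does not notice the gradual loss of precision in the subnormal range, and the repeated multiplications still accumulate rounding errors.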

The fundamental flaw of this algorithm is that it uses floating-point math, so rounding errors prevent it from being accurate. Its limit checking is inaccurate as well. I proposed a strict algorithm (only integer math involved) with precise limit checking in my article: http://krashan.ppa.pl/articles/stringtofloat.
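To illustrate what "only integer math" can mean for the digit-collecting stage, here is a rough sketch that gathers the decimal digits exactly in a 64-bit integer and tracks a decimal exponent alongside. It is not the algorithm from the linked article; `DecimalParts` and `parse_decimal` are hypothetical names, and the hard part (converting digits and exponent to binary with correct rounding) is deliberately left out.

```cpp
#include <cctype>
#include <cstdint>
#include <stdexcept>

// Hypothetical result type: value == (negative ? -1 : 1) * digits * 10^exp10.
struct DecimalParts {
    bool negative;
    std::uint64_t digits;
    int exp10;
};

// Illustrative only: keeps up to 19 significant digits exactly in an integer.
DecimalParts parse_decimal(const char* s) {
    DecimalParts p{false, 0, 0};
    if (*s == '-') { p.negative = true; ++s; }
    else if (*s == '+') ++s;

    bool seen_point = false, seen_digit = false;
    int stored = 0;  // significant digits stored so far
    for (; *s != '\0' && *s != 'e' && *s != 'E'; ++s) {
        if (*s == '.') {
            if (seen_point) throw std::invalid_argument("Invalid string");
            seen_point = true;
            continue;
        }
        if (!std::isdigit(static_cast<unsigned char>(*s)))
            throw std::invalid_argument("Invalid string");
        seen_digit = true;
        if (stored < 19) {  // any 19-digit decimal number fits in 64 bits
            p.digits = p.digits * 10 + static_cast<std::uint64_t>(*s - '0');
            if (p.digits != 0) ++stored;   // leading zeros are not significant
            if (seen_point) --p.exp10;     // each fractional digit scales by 1/10
        } else if (!seen_point) {
            ++p.exp10;  // a dropped integer digit still shifts the scale
        }
    }
    if (!seen_digit) throw std::invalid_argument("Invalid string");

    if (*s == 'e' || *s == 'E') {
        ++s;
        bool exp_negative = false;
        if (*s == '-') { exp_negative = true; ++s; }
        else if (*s == '+') ++s;
        if (!std::isdigit(static_cast<unsigned char>(*s)))
            throw std::invalid_argument("Invalid string");
        int e = 0;
        for (; std::isdigit(static_cast<unsigned char>(*s)); ++s)
            if (e < 100000) e = e * 10 + (*s - '0');  // cap absurd exponents
        p.exp10 += exp_negative ? -e : e;
    }
    if (*s != '\0') throw std::invalid_argument("Invalid string");
    return p;
}
```

For the string "3.141592653589793e+307" this yields the digits 3141592653589793 and a decimal exponent of 292, with no rounding so far. Any inaccuracy can then only come from the final decimal-to-binary step, which is where a strict algorithm has to do its real work.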

Interesting, it looks like you put a lot of work into your algorithm. I tested it on a little-endian machine and it produces the same results as my function. FLT_MIN and FLT_MAX also come out with the same accurate values for both functions. So maybe you can point out a string that would make my function return an inaccurate value; I’m unable to find one.

Sure. I used double precision, which carries roughly 16 significant decimal digits. I added “cout.precision(16)” to main(), pasted my code into yours so that both functions were available, and renamed my atof() to atof2() to avoid a conflict with the standard library.

Here is my main():

    cout.precision(16);
    cout << stof("3.141592653589793e+307") << endl;
    cout << atof2("3.141592653589793e+307") << endl;

The number used has 16 significant digits. On my machine, compiled with GCC, I get the following results:

3.141592653589791e+307

3.141592653589793e+307

Then I went further and decided to check exact hex values of returned double numbers. After some typecasting like:

    unsigned long long int *z;
    double p;

    p = stof("3.141592653589793e+307");
    z = (unsigned long long int*)&p;
    cout << hex << *z << endl;

I found that our functions return the following values:

your stof(): 0x7fc65e6f105a3049

my atof2(): 0x7fc65e6f105a304c
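As a side note, reading a double through a pointer to an integer type is not something the C++ standard guarantees to work, even though it gave sensible results here. A more portable way to get the same bit pattern is to copy the bytes. This is only a sketch; `double_bits` is a hypothetical helper name, and it assumes a 64-bit IEEE 754 double.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a double.
// Assumes double is a 64-bit IEEE 754 binary64 value.
std::uint64_t double_bits(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "double is expected to be 64 bits wide");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // byte copy avoids aliasing issues
    return bits;
}
```

Printing `double_bits(p)` with `cout << hex` should give the same hex strings as the pointer cast on the usual platforms.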

More interestingly, I also tried the standard library atof(), which returns 0x7fc65e6f105a3050. I then decided to calculate the real value of all three results using the ttmath online calculator with a 512-bit mantissa. Dissecting the doubles, we get:

The sign bit is of course cleared in all three results. The exponent field in all three is 2044, or 1021 after subtracting the offset of 1023. The 52-bit significands are:

1792680968990793 – for your stof()

1792680968990796 – for my atof2()

1792680968990800 – for stdlib atof()

Then I apply the formula “x = 2^exp * (1 + significand * 2^-52)” to get the real values, using http://www.ttmath.org/online_calculator with 512-bit precision. I’ve placed a vertical bar after 16 significant digits:

3.141592653589791|3220 e+307 – for your stof(), error is -1.678e+292

3.141592653589792|8188 e+307 – for my atof2(), error is -0.181e+292

3.141592653589794|8147 e+307 – for stdlib atof(), error is +1.814e+292
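The dissection and the formula above can also be written down in a few lines of code. This is a minimal sketch assuming 64-bit IEEE 754 doubles and normal (not subnormal, infinite, or NaN) values; `DoubleFields`, `dissect`, and `rebuild` are hypothetical names introduced only for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical container for the three fields of an IEEE 754 double.
struct DoubleFields {
    bool sign;
    int biased_exponent;        // 11-bit field, offset by 1023
    std::uint64_t significand;  // lower 52 bits, without the implicit leading 1
};

// Splits a raw 64-bit pattern into sign, exponent, and significand.
DoubleFields dissect(std::uint64_t bits) {
    DoubleFields f;
    f.sign = (bits >> 63) != 0;
    f.biased_exponent = static_cast<int>((bits >> 52) & 0x7ff);
    f.significand = bits & ((std::uint64_t(1) << 52) - 1);
    return f;
}

// Applies x = 2^exp * (1 + significand * 2^-52); valid for normal numbers only.
double rebuild(const DoubleFields& f) {
    double fraction = 1.0 + std::ldexp(static_cast<double>(f.significand), -52);
    double x = std::ldexp(fraction, f.biased_exponent - 1023);
    return f.sign ? -x : x;
}
```

For 0x7fc65e6f105a304c, `dissect` gives a cleared sign bit, an exponent field of 2044, and the significand 1792680968990796, matching the values listed earlier. `rebuild` only reproduces the double itself, so the extra digits after the vertical bar still need an arbitrary-precision calculator.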

So my code provides the most accurate conversion possible (changing the significand by +1 or -1 increases the absolute value of the error). To my surprise, my code beats even the standard library function (and in fact your code beats it too, by a small margin). It may be, however, that my standard library is not very modern (I used the GCC 4.4.5 compiler).

BTW I see that the blog engine has eaten stof() template types (as suspicious HTML tags, I suppose). Of course I’ve used double type in all the code above.

ADDED: It turned out that the standard library atof() errors only occurred with the obscure stdlib version I had used. After I changed the standard library, the atof() results were identical to my code's results, at least for the few values I checked. So my claim that “my code beats even the standard library” is false in general.