Float <-> fixed point: I don't understand how the algorithm that converts between floating-point and fixed-point values works. I would like to know why the method shown below works.
Question details
The first part of this post explains the situation using actual code, to make the question easier to understand.
The second part explains what I don't understand.
When converting a floating-point value to a fixed-point value and storing it, the following algorithm seems to be in common use.
For the sake of clarity, the conversion is performed according to the following rules.
・The fractional part of the fixed-point value is the lower 8 bits.
・Fixed-point values are stored in int variables (fixed).
・Floating-point values are stored in float variables (floats).
Floating point → fixed point conversion algorithm
#include <math.h> /* roundf */

int float_to_fixed(float floats)
{
    int ret;
    /* Given floating-point number × 2^(number of fractional bits), rounded to the nearest integer */
    ret = (int)roundf(floats * (1 << 8));
    return ret;
}
Fixed point → floating point conversion algorithm
float fixed_to_float(int fixed)
{
    float ret;
    /* Cast the fixed-point value to float and divide by 2^(number of fractional bits) */
    ret = (float)fixed / (float)(1 << 8);
    return ret;
}
References
1: https://embeddedartistry.com/blog/2018/07/12/simplefixedpointconversioninc/
2: https://medium.com/incrediblecoder/convertingfixedpointtofloatingpointformatandviceversa6cbc0e32544e
I don't understand why the two functions shown above can convert between floating point and fixed point in this way.
Here is why I don't understand. As I understand it, float and fixed-point values have completely different structures at the bit level.
A floating-point value is divided into a sign part, an exponent part, and a mantissa part, whereas a fixed-point value is divided into a sign part, an integer part, and a fractional part. So it seems to me that no amount of bit shifting can convert a floating-point value to a fixed-point value, because the role of each bit is completely different to begin with.
I can understand converting an integer value, such as an int, to a fixed-point value.
An integer can be regarded as having its binary point to the right of the least significant bit, so to turn it into a fixed-point value you only have to shift the bits to the left by the width of the fractional part.
Have I conveyed the intent of my question?
Please lend me your help.

Answer # 1

Answer # 2
If you scale the value up, cast it to an integer so that everything after the binary point is discarded,
and then scale it back to the original position, the bits below the shift amount are reset to zero, aren't they?
I agree
That's exactly right.
Bit-shifting a floating-point type does not turn it into the correct form of a fixed-point type.
But is the code in question really a bit shift of the floating-point number?
It just multiplies the floating-point number by 256 (= 1<<8), rounds off the fractional part, and casts the result to an integer.