C strings

C string types are arrays with shape { c-string encoding }, where encoding is an encoding descriptor. The type c-string is an alias for { c-string utf8 }. See Encoding descriptors for information about encoding descriptors. In TYPEDEF:, FUNCTION:, CALLBACK:, and STRUCT: definitions, the shorthand syntax c-string[encoding] can be used to specify the string encoding.

Using C string types triggers automatic conversions:
Passing a Factor string to a C function expecting a c-string allocates a byte-array in the Factor heap; the string is then encoded to the requested encoding and a raw pointer is passed to the function. Passing an already encoded byte-array also works and performs no conversion.
Returning a C string from a C function allocates a Factor string in the Factor heap; the memory pointed to by the returned pointer is then decoded with the requested encoding into the Factor string.
Reading c-string slots of STRUCT: or UNION-STRUCT: returns Factor strings.
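
As a minimal sketch of the shorthand syntax and of the automatic conversion on arguments, the following binds the C function strlen twice with different encodings. The names strlen-utf8 and strlen-latin1 are made up for this illustration. The sketch assumes a recent Factor release (FUNCTION: and FUNCTION-ALIAS: without a trailing semicolon, the io.encodings.latin1 vocabulary) and that the "libc" library is registered, as it is in a stock image.

```factor
USING: alien.c-types alien.syntax io.encodings.latin1
io.encodings.utf8 prettyprint ;
IN: scratchpad

LIBRARY: libc

! Two Factor-side names (hypothetical) for the same C function, strlen.
! They differ only in how the Factor string is encoded before the call.
FUNCTION-ALIAS: strlen-utf8 size_t strlen ( c-string[utf8] s )
FUNCTION-ALIAS: strlen-latin1 size_t strlen ( c-string[latin1] s )

! The same five-character string is passed both times; the byte count
! seen by C depends on the encoding, since the character ï takes two
! bytes in UTF-8 and one byte in Latin-1.
"naïve" strlen-utf8 .
"naïve" strlen-latin1 .
```

In both calls the Factor string is encoded into a temporary byte array and only a pointer to that array reaches C.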


Care must be taken if the C function expects a pointer to a string with its length represented by another parameter rather than a null terminator. Passing the result of calling length on the string object will not suffice. This is because a Factor string of n characters will not necessarily encode to n bytes. The correct idiom for C functions which take a string with a length is to first encode the string using encode, and then pass the resulting byte array together with the length of this byte array.
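
The idiom described above can be sketched as follows. The helper name string>bytes+length is hypothetical, as is the C function send_text mentioned in the comment; only encode and length are existing words.

```factor
USING: io.encodings.string io.encodings.utf8 kernel sequences ;
IN: scratchpad

! Encode first, then measure the encoded bytes, not the string.
: string>bytes+length ( string encoding -- byte-array length )
    encode dup length ;

! For a hypothetical C function declared as
!     int send_text ( void* buf, size_t len )
! the call would then look like
!     "naïve" utf8 string>bytes+length send_text
```

For "naïve" in UTF-8 this yields a six-byte array and the length 6, whereas calling length on the string itself gives 5. The byte array produced by encode carries no null terminator, which suits functions that take an explicit length.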

Sometimes a C function has a parameter type of void*, and various data types, among them strings, can be passed in. In this case, strings are not automatically converted to aliens, and instead you must call one of these words:
string>alien ( string encoding -- byte-array )

malloc-string ( string encoding -- alien )


The first allocates a byte array in the Factor heap; the second allocates manually managed memory, which is not moved by the garbage collector and must be explicitly freed by calling free. See Byte arrays and the garbage collector for a discussion of the two approaches.

The C type char* represents a generic pointer to char; arguments with this type will expect and return aliens, and won't perform any implicit string conversion.

A word to read strings from arbitrary addresses:
alien>string ( c-ptr encoding -- string/f )
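
A small sketch of how string>alien, malloc-string and alien>string fit together. The word names roundtrip-gc and roundtrip-malloc are invented for the illustration, and the vocabulary names in USING: assume a recent Factor release, where malloc-string lives in alien.data.

```factor
USING: alien.data alien.strings io.encodings.utf8 kernel libc ;
IN: scratchpad

! Byte array in the Factor heap: reclaimed by the garbage collector.
: roundtrip-gc ( string -- string' )
    utf8 string>alien utf8 alien>string ;

! Copy in malloc'ed memory outside the Factor heap: decode it back,
! then release it by hand with free.
: roundtrip-malloc ( string -- string' )
    utf8 malloc-string [ utf8 alien>string ] [ free ] bi ;
```

Both words return a string equal to their input. In real code the pointer would be handed to a C function between the allocation and the call to free.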


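For example, if a C function returns a c-string but stipulates that the caller must deallocate the memory afterward, you must declare the function as returning char*, decode the returned pointer with alien>string, and then call (free) on it yourself. Declaring the return type as c-string would decode the string and discard the pointer, leaving nothing to free.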
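
One possible illustration uses the C library function strdup, which returns memory that the caller owns. The wrapper name strdup>string is hypothetical, and the sketch assumes a recent Factor release (FUNCTION: without a trailing semicolon) with the "libc" library registered, as in a stock image.

```factor
USING: alien.c-types alien.strings alien.syntax io.encodings.utf8
kernel libc ;
IN: scratchpad

LIBRARY: libc

! Declared as char* rather than c-string, so the call returns an alien
! and the pointer stays available for deallocation.
FUNCTION: char* strdup ( c-string s )

! Decode the C string into a Factor string, then release the C memory.
: strdup>string ( string -- string' )
    strdup [ utf8 alien>string ] [ (free) ] bi ;
```

The argument is still declared as c-string, so only the return value needs manual handling.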