Writing POSIX-Standard Code

Microsoft Windows Services for UNIX 3.0

Microsoft Consulting Services

On This Page

Overview
POSIX Overview
Benefits of POSIX
POSIX Application Conformance
POSIX Implementation on NT
INTERIX Architecture and Win32 Subsystems
POSIX and INTERIX
Porting Code
Porting Applications
Macros and Constants
Macros in INTERIX
Macros Usage for include File
LAB 1: Macros Usage
System Information
Lab 2: System Information
SIGNALS
Signals in POSIX
Signals in INTERIX
LAB 3: Signals in POSIX
LAB 4: Using Signals with Masks
Signals in INTERIX
Sockets in POSIX
Server Client Communication
LAB 5: Socket Programming
System V IPC in POSIX
Message Queues
LAB 6: Message Queues
Semaphores
Semaphores: System Calls
LAB 7: Semaphores
Shared Memory
LAB 8: Shared Memory
BSD Strings and Memory Functions
Copying and Concatenation Functions
String Length functions
String/Array Comparison Functions
Collation Functions
Search Functions
Finding Tokens in a String
Functions not Supported by INTERIX

Overview

  • POSIX environment and benefits

  • Differences between compile time macros and manifest constants in POSIX and usage

  • System information in POSIX.1

  • Signals in POSIX.1 Extensions

  • Sockets programming

  • System V IPC mechanisms-shared memory, message queues and semaphores

  • Usage of BSD strings and memory functions

Objectives

This session provides a brief introduction to POSIX and to some of the issues involved in writing code that conforms to the POSIX.1 environment, with emphasis on INTERIX as the development environment.

What you will learn

At the end of this session, you will learn:

  • How to use compile time macros and manifest constants in POSIX

  • How to extract system information in POSIX applications

  • How to write POSIX applications using sockets and signals and port POSIX applications from UNIX

Recommended Reading

The following references are useful in understanding the POSIX.1 programming environment:

Lewine, Donald. POSIX Programmer's Guide.

O'Reilly, 1991. ISBN 0-937175-73-0

Stevens, W. Richard. Advanced Programming in the UNIX Environment.

Addison-Wesley, 1992. ISBN 0-201-56317-7

Zlotnick, Fred. The POSIX.1 Standard: a programmer's guide.

Benjamin/Cummings, 1991. ISBN 0-8053-9605

POSIX Overview

What is POSIX?

  • Standard environment to enable portability of applications software

  • Achieved through specifications and conformance

  • Implemented as subsystems on NT platform

  • POSIX.1 defines a C language source code-level application programming interface (API) to an operating system environment

POSIX Systems

POSIX was designed as a standard environment to enable the portability of software applications. The portability of a software application is achieved through the specification of a set of services that every POSIX conforming application can expect to exist on a conforming platform. For a standard to be of practical benefit, there has to be a method of measuring adherence to its requirements.

What Is POSIX?

POSIX stands for Portable Operating System Interface for computing environments. POSIX began as an effort by the IEEE community to promote the portability of applications across UNIX® environments by developing a clear, consistent, and unambiguous set of standards. However, POSIX is not limited to the UNIX environment. It can also be implemented on non-UNIX operating systems, as was done with the IEEE Standard 1003.1-1990 (POSIX.1) implementations on Virtual Memory System (VMS), Multiprogramming Executive (MPE), and the Conversion Technology Operating System (CTOS).

POSIX actually consists of a set of standards that range from POSIX.1 to POSIX.12.

Benefits of POSIX

  • Significant reduction in cost and effort for porting

  • Problems in porting code

    • Coding habits

    • Human errors

  • Benefits of using POSIX for porting code

    • Similar development environment to UNIX

    • Broad range of familiar services and tools

    • Native OS support for development

POSIX provides software developers with the opportunity for a significant reduction in cost and effort when porting applications to different platforms.

Given the benefits, what stands in the way of creating POSIX-conforming applications? There are two main reasons why applications fail to conform to POSIX requirements:

  • Know-how and old habits.

  • Human oversight and error, which cause new problems even after the first obstacle has been overcome.

Because of the broad range of services offered by POSIX, it can take some time for developers to be familiar with POSIX. Old Unix programming habits and know-how are easily transferred to a POSIX development environment. However, programmers cannot be expected to be familiar with all the intricacies of POSIX and how it differs from what they are familiar with.

Speaking Unix with a POSIX accent will not solve portability problems, particularly on proprietary platforms that support POSIX. It is necessary to speak POSIX as a native language, and if using Unix, perhaps with a Unix accent. Training can go some way toward ensuring a smoother transition to a POSIX-only environment.

POSIX Application Conformance

  • System to have POSIX.1 Conformance

    • support all of the interfaces as defined in the ISO/IEC 9945-1

    • POSIX.1 Conformance Document (PCD)

    • Pass the NIST Test suite

  • Four categories of compliance

    • Strictly conforming POSIX.1 Applications

    • Applications conforming to ISO/IEC and POSIX.1

    • Applications conforming to POSIX.1 and <National Body>

    • POSIX.1-conforming applications that use extensions

POSIX Conformance

For a system to be given a certificate of POSIX.1 conformance, it must meet the following requirements:

  • The system must support all of the interfaces as defined in ISO/IEC 9945-1.

  • The vendor must supply a POSIX.1 Conformance Document (PCD) with their implementation as specified in ISO/IEC 9945-1.

  • The implementation must pass the appropriate National Institute of Standards and Technology (NIST) test suite.

Application Compliance to POSIX.1

Many people talk about a "POSIX-compliant" application, but what does that really mean? For POSIX.1, there are four categories of compliance, ranging from a very strict compliance to a very loose compliance.

The various categories of compliance are outlined in the following subsections.

Strictly conforming POSIX.1 applications

A strictly conforming POSIX.1 application requires only the facilities described in the POSIX.1 standard and applicable language standards. A strictly conforming POSIX.1 application:

  • Does not rely on any behavior described in ISO/IEC 9945-1 as unspecified or implementation-defined.

  • May only use facilities described in the standard. However, because the behavior of some of those facilities may vary across implementations, such an application may need to be modified to run on different platforms.

This is the strictest level of application conformance. Applications at this level should be able to move across implementations with just a recompilation.

Applications conforming to ISO/IEC and POSIX.1

An ISO/IEC-conforming POSIX.1 application is one that uses only the facilities described in ISO/IEC 9945-1 and approved conforming language bindings for ISO or IEC standards. This type of application must include a statement of conformance that documents all options and limit dependencies, and all other ISO or IEC standards used.

This level of conformance is not as strict as the previous one for two reasons. First, it allows a POSIX.1 application to make use of other ISO or IEC standards, such as Graphical Kernel System (GKS). Second, it allows POSIX.1 applications within this level to require options or limit values beyond the minimum. For example, such an application could require that the implementation support filenames of at least 16 characters. The POSIX.1 minimum is 14 characters.

Applications conforming to POSIX.1 and <National Body>

A <National Body>–conforming POSIX.1 application differs from an ISO/IEC-conforming POSIX.1 application because this type of application may also use specific standards of a single ISO/IEC member organization, such as the American National Standards Institute (ANSI) or the British Standards Institution (BSI). This type of application must include a statement of conformance that documents all options and limit dependencies, and all other <National Body> standards used.

POSIX.1-conforming applications that use extensions

A conforming POSIX.1 application using extensions is an application that differs from a conforming POSIX.1 application because it uses nonstandard facilities that are consistent with ISO/IEC 9945-1. Such an application must fully document its requirements for these extended facilities.

POSIX Implementation on NT

POSIX Subsystem

The POSIX subsystem is implemented in Windows NT as a protected server. POSIX applications communicate with the POSIX subsystem through a message-passing facility in the Executive known as a Local Procedure Call (LPC).

The POSIX subsystem, as well as each POSIX application, runs in its own protected address space that protects it from any other application that might be running on Windows NT. POSIX applications are preemptively multitasked with respect to each other and to other applications running in the system.

The high level technical overview

  • All of the services on Windows NT are provided by subsystems: some functional and some environmental.

  • An environment subsystem presents the user and programmer interfaces.

  • Using an environment subsystem architecture means that the INTERIX subsystem can provide the exact behavior of a UNIX OS, rather than a Win32 library emulation that can only provide an incomplete Win32 view of the UNIX functionality. This is the key difference between Microsoft's INTERIX and any of the Win32 library solutions (RedHat/Cygnus cygwin32, David Korn's U/Win, and MKS/Datafocus NutCracker).

  • Exact behavior means an easier and more exact port and the result behaves correctly.

INTERIX Architecture and Win32 Subsystems

INTERIX and Win32 subsystems

INTERIX is a kernel-integrated environment subsystem for migrating applications from UNIX systems to Windows NT/2000. It requires source code, so that the application can be rebuilt as a native NT process that runs at native performance. The simplest way to think about it is as what the original POSIX subsystem would be if it were a complete, practical implementation. The INTERIX environment subsystem replaces the original Microsoft POSIX subsystem with something far more functional, robust, and usable, and better integrated with the rest of the NT environment.

The simple architecture description

Everything above the NT/2000 kernel is a subsystem. There are functional subsystems (e.g. I/O, security) and environment subsystems (Win32, the historical Microsoft POSIX subsystem, the INTERIX subsystem).

There is sufficient granularity in the kernel interface that by dropping down different environment subsystems, one can turn NT into any sort of machine one desires. Microsoft turned NT into a Windows machine with the Win32 environment subsystem. You can run many environment subsystems concurrently. Environment subsystems are the way NT was designed.

Every application process (with very few exceptions) runs as a client of an environment subsystem. So when you run Microsoft Word, it is a client of the Win32 environment subsystem. When Word opens a file on disk, there is a lot of fast LPC happening between the process (Word), the Win32 environment subsystem, the security subsystem (e.g. to authenticate access to the file), and the I/O subsystem (to walk the driver stack to the file on disk).

The INTERIX subsystem does the same thing. When a vi edit process reaches for a file on disk, it is performing all the same fast LPC to the INTERIX environment subsystem, the security subsystem, and the I/O subsystem to get to the file on disk. The impact of "another" subsystem is minimal. It's just one more subsystem in an architecture that is subsystem centric.

POSIX and INTERIX

  • Interix is a POSIX.1 system with extensions taken from both BSD and System V

  • Applications are often written for specific operating systems and then ported to others using

    • some combination of conditional compilation (#ifdef preprocessor commands)

    • Or wrapper function

  • Porting to Interix

    • Making the source portable

    • Dealing with issues specific to Interix

  • When porting a new program, examine the application for features that aren't yet supported in the Interix subsystem

The INTERIX subsystem is a POSIX.1-conformant subsystem. It supports sockets, BSD 4.4 interfaces and the System V IPC mechanisms, pseudo terminals, memory mapped files, and a wealth of other functionality. Although INTERIX is a POSIX.1-conforming system, it takes different approaches in some areas than traditional systems, and there are differences outside the areas defined by the standard.

Applications are often written for specific operating systems and then ported to others, using some combination of conditional compilation (#ifdef preprocessor commands) or wrapper functions. Porting such software to INTERIX involves two steps:

  • Making the source portable. Making the source portable is actually choosing the features supported by INTERIX, which includes the POSIX.1 interface and some extensions taken from other standards (such as the Single UNIX Specification) and from traditional systems, such as BSD or System V derivatives.

  • Dealing with issues specific to INTERIX.

The usual method for porting an application to INTERIX is to move the source to an INTERIX system and then recompile. If a Makefile exists, porting may be as simple as typing make. In the absence of a Makefile, the xmkmf utility on INTERIX can be used to create one.

Also, Imake is a C preprocessor interface to the make utility. It is used to generate Makefiles from a template, a set of cpp macro functions, and a per-directory input file called an Imakefile.

This allows machine dependencies (such as compiler options, alternate command names, and special make rules) to be kept separate from the descriptions of the various items to be built.

Porting Code

Code Portability Strategies

  • Write custom library of functions for target platform

  • Define platform-specific macros to resolve to correct functions

  • Complicated code using #ifdef statements to write code for multiple platforms

    • Versioning issues arise

Strategies for porting code

There are several basic approaches to port code. Most applications use more than one of these ways.

  1. Writing a custom library of functions for each platform. The application always calls the private version of the function, which is linked to a platform-specific library. This is a great deal of work, though it may be the appropriate method for a large body of source.

    For example, a custom library function, func1(), (not a system call) for the application to be ported can be implemented in different libraries, say func_NT.lib and func_UX.lib. When compiling on a particular system, the appropriate library is linked.

  2. Defining an extensive set of platform-specific macros that expand to the correct functions and names. For example,

    /* func1_ux(): function to be used in the UNIX environment  */
    /* func1_nt(): function to be used in the WinNT environment */
    #ifdef __INTERIX
    #define FUNC1 func1_nt()
    #else
    #define FUNC1 func1_ux()
    #endif

    Note: The _OPENNT macro is deprecated in Services for UNIX 3.0 and may not be supported in future releases. Use __INTERIX instead.

  3. Using #ifdef statements to isolate sections of code based on the platform. For example, INTERIX supports POSIX.1, ANSI/ISO C, and many interfaces from both historical BSD and SVR4 systems. Older code written using #ifdefs may make assumptions about the platform that are not valid. For example, code built around #ifdef BSD will usually try to include <sgtty.h> rather than <termios.h>. By labeling blocks of code with the platform name, you are often trying to hit a moving target; for example, BSD4.4 has some different APIs than BSD4.3, but they are both BSD.

Porting Applications

Porting Application Process

  • Most common changes required for porting are:

    • Selecting the appropriate compile flags for a POSIX.1 system

    • Replace the varargs macros with stdarg macros

    • Replace references to absolute path names (such as /bin/sh) with INTERIX-style path names, either by using confstr() or by using the INTERIX routines supplied

    • Isolate symbolic links with #ifdef S_IFLNK. When symbolic links are implemented, this macro will be defined in <sys/stat.h> and the code will be included at the next compilation

Porting process

The typical porting process is to type make in the source directory and then fix the problems as the compiler reports them. The most common changes required are:

  • Selecting the appropriate compile flags for a POSIX.1 system.

  • Replacing the varargs macros with stdarg macros.

Macros and Constants

  • POSIX specifies its own set of compile-time macros and manifest constants, not defined in either System V or BSD systems

  • Programs written in a BSD environment often include <sys/param.h>

  • The INTERIX SDK provides this header file—it includes <sys/types.h> and <limits.h>.

  • Change manifest constants in your code

Compile-time Macros and Manifest Constants

POSIX specifies its own set of compile-time macros and manifest constants, not defined in either System V or BSD systems. These macros and constants are unique to POSIX and are defined in the specified include files.

  • Programs written in a BSD environment often include <sys/param.h> include file.

  • The INTERIX SDK provides this header file. It includes <sys/types.h> and <limits.h>.

Defined Symbols in INTERIX

The c89 compiler defines the following macro symbols to be 1: _POSIX_ and __INTERIX. The _POSIX_ symbol is a reserved symbol, and should not be used or modified. The cc utility defines the symbols __INTERIX and unix.

Note: The "unix" macro was defined because many application sources, intended to compile on multiple platforms, use this simple macro to call out features found on UNIX systems, as opposed to DOS or VMS.

The c89 compiler by default passes the /Za option to the Microsoft compiler, which defines __STDC__, unless you specify -N nostdc.

The cc utility by default passes the /Ze option and the Microsoft compiler does not define __STDC__.

Isolating INTERIX-Specific Features

If you want to add or isolate features specific to INTERIX, you can surround it with a #ifdef __INTERIX. This macro is automatically defined by cc and c89.

Macros in INTERIX

  • POSIX.1 defines a macro, _POSIX_SOURCE, that tightly controls the namespace. Don't use it.

  • It is defined in the standard to force the standard defined name space

  • Define _ALL_SOURCE

  • Used by various UNIX vendors to open up the name space to allow it to be seen to build/port applications

  • Interix uses it as well!

Unless you know for certain that a program builds in a pure POSIX.1 environment, defining _POSIX_SOURCE will cause compile-time failures, depending on what the application source does.

INTERIX uses the _ALL_SOURCE macro to control the namespace. INTERIX also defines the __INTERIX macro.

Sample code:

The following fragment, one branch of a larger #if/#elif block, shows an example of macro usage:

#elif defined(__INTERIX)
#define     S_ISLNK(m) (0)
#define     USE_FCNTL_SERIALIZED_ACCEPT
#undef     HAS_GMTOFF
#define     NO_SETSID
#define     JMP_BUF sigjmp_buf
#include <sys/time.h>
#define     getwd(d)      getcwd(d,MAX_STRING_LEN)

Macros Usage for include File

  • Included header files are structured to coexist with the Single UNIX Specification

  • Separate files to define

    • POSIX.1 functions, e.g. <string.h>

    • Single UNIX Specification functions, e.g. <strings.h>

  • Use appropriate macro to define the API namespace

    • _POSIX_SOURCE—POSIX namespace

    • _ALL_SOURCE—all APIs provided with INTERIX

    • To be specified before the first header file include

The INTERIX header files are structured to align with the Single UNIX Specification. For example, the string and memory functions that occur in POSIX.1 are in <string.h>, while those which are in the Single UNIX Specification but not in POSIX.1 are in <strings.h>.

The include files are also structured to restrict the API namespace. If you define the macro _POSIX_SOURCE to be 1 before the first header file is included, your program is restricted to the POSIX namespace. It will contain only those APIs specified in the POSIX standards. You may find this restrictive.

Sample code

To get all of the APIs provided with INTERIX, #define _ALL_SOURCE as 1 before the first header file is included. For example:

#define _ALL_SOURCE 1
#include <unistd.h>

LAB 1: Macros Usage

  • Run the INTERIX Korn shell.

  • Change to the root directory, using the cd / command.

  • Check that you are in the root directory with the pwd command

Compile the following program using the following steps.

  1. Run the INTERIX Korn shell.

  2. Change to the root directory, using the cd / command. Verify that you're in the root directory, using the pwd command. The response is: /

  3. With the current directory as the root directory, create a directory named samples and change to it by using the following commands:

    $ mkdir samples
    $ cd samples
  4. At the $ prompt, type the following command:

    $ vi va.c
  5. Type the code as given below for the va.c file. You can also copy and paste the code, if it is already coded and available in a Windows text file.

    /* VA.C: The program below illustrates passing a variable
     * number of arguments using the following macros:
     *      va_start            va_arg              va_end
     *      va_list             va_dcl (UNIX only) */
    #include <stdio.h>
    #include <varargs.h>
    int average( va_list );
    void main( void )
    {
       /* Call with 3 integers (-1 is used as terminator). */
       printf( "Average is: %d\n", average( 2, 3, 4, -1 ) );
       /* Call with 4 integers. */
       printf( "Average is: %d\n", average( 5, 7, 9, 11, -1 ) );
       /* Call with just -1 terminator. */
       printf( "Average is: %d\n", average( -1 ) );
    }
    /* Returns the average of a variable list of integers. */
    int average( va_alist )
    va_dcl
    {
       int count = 0, sum = 0, i ;
       va_list marker;
       va_start( marker );            /* Initialize variable arguments. */
       i = va_arg( marker, int );
       while( i != -1 )
       {
          sum += i;
          count++;
          i = va_arg( marker, int);
       }
       va_end( marker );              /* Reset variable arguments.      */
       return( sum ? (sum / count) : 0 );
    }
    
  6. On completion of the code and while in the vi editor, press the ESC key, then the : key (the colon key), followed by wq to save the file and exit from the editor application.

  7. To compile this file, use the gcc compiler as shown below:

    $ gcc -o lab1 va.c

    The following errors appear on the screen:

    args.c: In function `main':
    args.c:19: warning: passing arg 1 of `average' makes pointer from integer without a
    cast
    args.c:19: too many arguments to function `average'
    args.c:22: warning: passing arg 1 of `average' makes pointer from integer without a
    cast
    args.c:22: too many arguments to function `average'
    args.c:25: warning: passing arg 1 of `average' makes pointer from integer without a
    cast
    args.c:17: warning: return type of `main' is not `int'
    args.c: In function `average':
    args.c:33: argument `__builtin_va_alist' doesn't match prototype
    args.c:13: prototype declaration
  8. The above errors occur because INTERIX does not support the pre-ANSI variable-argument macros (declared in <varargs.h>).

    Now, you need to modify the code to work on INTERIX, but still retain backward compatibility with the pre-ANSI version. This is accomplished by introducing the macro ANSI, testing it with "#ifdef ANSI", and writing an ANSI version that uses <stdarg.h>, as listed below:

    /* VA.C: The program below illustrates passing a variable
     * number of arguments using the following macros:
     *      va_start            va_arg              va_end
     *      va_list             va_dcl (UNIX only) */
    #include <stdio.h>
    #define ANSI            /* Comment out for Pre-ANSI version     */
    #ifdef ANSI             /* ANSI compatible version          */
    #include <stdarg.h>
    int average( int first, ... );
    #else                   /* Pre-ANSI version          */
    #include <varargs.h>
    int average();          /* Pre-ANSI declaration: no parameter list */
    #endif
    int main( void )
    {
       /* Call with 3 integers (-1 is used as terminator). */
       printf( "Average is: %d\n", average( 2, 3, 4, -1 ) );
       /* Call with 4 integers. */
       printf( "Average is: %d\n", average( 5, 7, 9, 11, -1 ) );
       /* Call with just -1 terminator. */
       printf( "Average is: %d\n", average( -1 ) );
       return 0;
    }
    /* Returns the average of a variable list of integers. */
    #ifdef ANSI             /* ANSI compatible version    */
    int average( int first, ... )
    #else
    int average( va_alist )
    va_dcl
    #endif
    {
       int count = 0, sum = 0, i ;
       va_list marker;
    #ifdef ANSI
       i = first;
       va_start( marker, first );     /* Initialize variable arguments. */
    #else
       va_start( marker );            /* Initialize variable arguments. */
       i = va_arg( marker, int );
    #endif
       while( i != -1 )
       {
          sum += i;
          count++;
          i = va_arg( marker, int);
       }
       va_end( marker );              /* Reset variable arguments.      */
       return( sum ? (sum / count) : 0 );
    }
  9. To compile this file, use the gcc compiler as shown below:

    $ gcc -o lab1 va.c
  10. In the above compilation step, the output file is named lab1 (the name that appears after the '-o' option).

    Note: You can give any name to an output file.

  11. You can now execute the application by typing the output file name and its path. The path is required, because the samples directory is not in the PATH environment variable. For all other labs, you can add the sample directory to the PATH variable or execute the application by providing the complete file name.

    $ /samples/lab1
    Average is: 3
    Average is: 8
    Average is: 0

    The code has a conditional statement for using different versions of the functions to calculate the average. This is identified by the Macro Constant ANSI. If the same application is compiled and executed in a UNIX environment (with Pre-ANSI compiler), the second version of the include file and the average function will be used. The output of the application will still remain the same.

System Information

  • BSD systems use

    • sysctl()

  • System V provides

    • sysinfo()

  • POSIX routines can access some of this information

  • POSIX system information routines:

    • confstr(): retrieves string values

    • fpathconf(), pathconf(): retrieve configurable pathname variables

    • sysconf(): retrieve system information

Getting system information

There are different historical ways to determine information about the system. On BSD systems, the sysctl() interface provides access to system information, while on System V systems, the sysinfo() call provides system information.

Some of this information can be obtained using POSIX routines and some cannot. The POSIX system information routines return strings, paths, and numeric values (including two-valued boolean conditions).

The header file <limits.h> also contains macros defining system limits. The POSIX system information routines are as follows:

  • confstr()

    Retrieve string values. The only portable value of confstr() is _CS_PATH, a value for PATH, guaranteed to find the standard utilities.

  • fpathconf(), pathconf()

    Retrieve configurable pathname variables, such as the maximum size of a file name or the maximum link count. Familiar to System V programmers.

  • sysconf()

    Retrieve system information, such as whether job control is available, whether POSIX options are supported, the limits for bc, and the maximum number of bytes allowed as an argument to exec().

Lab 2: System Information

Compile the following program using the following steps.

  1. Run the INTERIX Korn shell.

  2. Browse to the samples directory

    $cd samples
  3. At the $ prompt, type the following command:

    $ vi syssample.c
  4. Type the code as given below for the syssample.c file. You can also copy and paste the code, if it is already coded and available in a Windows text file.

    /* syssample.c: The program below illustrates usage of sysconf() */
    #include <stdio.h>
    #include <unistd.h>
    void main( void )
    {
    /* retrieve the system information */
       long sinfo;
       sinfo = sysconf(_SC_VERSION);
       printf("Version supported: %ld\n",sinfo);   /* sysconf() returns long */
       sinfo = sysconf(_SC_LINE_MAX);
       printf("Maximum line: %ld\n",sinfo);
       return;
    }
  5. On completion of the code and while in the vi editor, press the ESC key, then the : key (the colon key), followed by wq to save the file and exit from the editor application.

  6. To compile this file, use the gcc compiler as shown below:

    $ gcc -o syssample syssample.c

    (ignore complaints from gcc about the return type of main)

  7. Now you can execute the application by typing the output file name and its path:

    $ syssample
    Version supported: 199009
    Maximum line: 2048
  8. You can use the other parameters of the sysconf() function to verify the results.

  9. Following is an excerpt from the man page of sysconf():

    NAME

    sysconf() – get configurable system variables

    SYNOPSIS

    #include <unistd.h>

    long sysconf (int name)

    DESCRIPTION

    This interface is defined by POSIX.1-1988 standard.

    The sysconf() function provides a method for applications to determine the current value of a configurable system limit or option variable. The name argument specifies the system variable to be queried. Symbolic constants for each name value are found in the include file <unistd.h>.

    The definitive list of symbolic constants is in the include file and is also available in the complete text of the manual page for sysconf().

  10. Following is an excerpt from the man page of confstr():

    NAME

    confstr() – get string-valued configurable variables

    SYNOPSIS

    #include <unistd.h>

    size_t confstr (int name, char * buf, size_t len)

    DESCRIPTION

    The confstr() function provides a method for applications to get configuration defined string values.

    The name argument specifies the system variable to be queried. Symbolic constants for each name value are found in the include file <unistd.h>. The len argument specifies the size of the buffer referenced by the argument buf. If len is non-zero, buf is a non-null pointer, and name has a value, up to len - 1 bytes of the value are copied into the buffer buf. The copied value is always null terminated.

    The available values are as follows:

    _CS_PATH: Return a value for the PATH environment variable that finds all the standard utilities.

    When the macros _ALL_SOURCE or _XOPEN_SOURCE are defined, or if _POSIX_C_SOURCE==2, then these are also available:

    _CS_SHELL - The POSIX shell.

    _CS_INSTALLEDDIR - The directory in which INTERIX is installed.

    _CS_TMPDIR - The INTERIX temporary directory.

    _CS_ETCDIR

    _CS_BINDIR

    _CS_INCLUDEDIR

    _CS_LIBDIR

    _CS_USRDIR

    _CS_PUBSDIR
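The two interfaces described above can be exercised together. The following is a minimal sketch, not part of the original lab: the helper names query_limit() and get_standard_path() are hypothetical, and the sketch assumes the standard POSIX behavior that sysconf() returns -1 without changing errno when a limit is indeterminate, and that confstr() called with a null buffer and a length of 0 returns the buffer size needed.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: query one sysconf() name and print the result.
 * Returns the value, or -1 if the name is invalid or has no definite limit. */
long query_limit(const char *label, int name)
{
    long value;

    errno = 0;                  /* lets us tell "no limit" apart from an error */
    value = sysconf(name);
    if (value == -1) {
        if (errno != 0)
            perror(label);
        else
            printf("%s: no definite limit\n", label);
    } else {
        printf("%s: %ld\n", label, value);
    }
    return value;
}

/* Hypothetical helper: return the _CS_PATH value in a newly allocated string.
 * The caller frees the result; NULL means the value is not available. */
char *get_standard_path(void)
{
    size_t needed = confstr(_CS_PATH, NULL, 0);  /* size, including the null */
    char *buf;

    if (needed == 0)
        return NULL;
    buf = malloc(needed);
    if (buf != NULL)
        (void) confstr(_CS_PATH, buf, needed);
    return buf;
}
```

Calling query_limit("_SC_LINE_MAX", _SC_LINE_MAX) from main() prints the same limit as the lab program, and get_standard_path() returns a PATH value that finds the standard utilities.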

SIGNALS

  • Signals are notifications sent to a process to inform it of important events

  • They interrupt whatever the process is doing at that moment and force it to handle them immediately

  • Each signal may have a signal handler, which is a function that gets called when the process receives that signal

  • Three typical ways of sending a signal

    • Keyboard keys

    • Command line

    • System Calls

What are signals?

Signals are notifications sent to a process to notify it of various important events. A signal interrupts whatever the process is currently doing and forces it to handle the signal immediately. Each signal is represented by an integer value, such as 1 or 2, as well as a symbolic name. These names are usually defined in a header file, /usr/include/signal.h.

Each signal may have a signal handler, which is a function that is called when the process receives the signal. A signal handler runs asynchronously: it is not called by the code in the program. Instead, when a signal is sent to the process, the operating system stops the execution of the process and forces it to call the signal handler. When the handler returns, the process resumes from the point at which the signal interrupted it.

Signals are very similar to interrupts in their behavior. The difference is that while interrupts are sent to the operating system by the hardware, signals are sent to the process by the operating system, or by other processes. Note that signals are not related to software interrupts, which are still sent by the hardware.

There are three different ways a signal can be sent to a process.

  • Sending the signal from the keyboard

  • Sending the signal through shell provided commands

  • Sending the signals from a process using system calls

    Each of these methods is discussed in the following sections.

Sending signals using the Keyboard

The most common way of sending signals to processes is by using the keyboard. There are certain key combinations that are interpreted by the system as requests to send signals to the process with which you are interacting:

  • Ctrl-C

    Pressing this key combination causes the system to send an INT signal (SIGINT) to the running process. By default, this signal causes the process to terminate immediately.

  • Ctrl-Z

    Pressing this key combination causes the system to send a TSTP signal (SIGTSTP) to the running process. This signal causes the process to suspend execution.

  • Ctrl-\

    Pressing this key combination causes the system to send a QUIT signal (SIGQUIT) to the running process. By default, this signal causes the process to terminate immediately. In UNIX systems, this signal would cause the process to generate a core image of the process, which can be examined (using a debugger) to determine the status of the process and its variables.

Sending Signals from the Command Line

Another way of sending signals to processes is through commands, usually internal to the shell:

  • kill

    The kill command accepts two parameters: a signal name (or number), and a process ID. The syntax for the kill command resembles the following:

    $ kill -<signal> <PID>

    For example, the following command would send the INT signal to process with PID 5342:

    $ kill -INT 5342 

    This has the same effect as pressing Ctrl-C in the shell that runs that process.

If no signal name or number is specified, by default a TERM signal is sent to the process, which normally causes the process to terminate and hence the name, kill.

NAME

kill - terminate or signal a process

SYNOPSIS

kill [-s signal_name] pid ...

kill [-signal_name] pid ...

kill [-signal_number] pid ...

kill [-l|-m]

DESCRIPTION

The kill utility sends the TERM signal to the processes specified by the pid operand(s).

Only a process with appropriate privileges may send signals to other users' processes.

The options are as follows:

-l Lists the signal names.

-m Lists the POSIX.2 signal number to signal name mapping.

-s signal_name

A symbolic signal name specifying the signal to be sent instead of the default TERM.

-signal_name

Equivalent to -s signal_name.

-signal_number

A non-negative decimal integer, specifying the signal to be sent.

Some of the more commonly used signals include:

1 SIGHUP (hangup)

2 SIGINT (interrupt)

3 SIGQUIT (quit)

6 SIGABRT (abort)

9 SIGKILL (non-catchable, non-ignorable kill)

14 SIGALRM (alarm clock)

15 SIGTERM (software termination signal)

DIAGNOSTICS

The kill utility exits 0 on success, and >0 if an error occurs.

Sending signals using System Calls

A third way of sending signals to processes is by using the kill system call. This is the normal way of sending a signal from one process to another. This system call is also used by the kill command.

The following sample code causes a process to suspend its execution by sending the STOP signal to itself:

    #include <unistd.h>     /* standard unix functions, like getpid()       */
    #include <sys/types.h>  /* various type definitions, like pid_t         */
    #include <signal.h>     /* signal name macros, and the kill() prototype */
    int main(void)
    {
        /* first, find my own process ID */
        pid_t my_pid = getpid();
        /* now that I have my PID, send myself the STOP signal */
        kill(my_pid, SIGSTOP);
        return 0;
    }

The man page of the kill system call is as follows:

NAME

kill() – terminate or signal a process

SYNOPSIS

#include <signal.h>

int kill (pid_t pid, int sig)

DESCRIPTION

The kill () function sends the signal given by sig to pid, a process or a group of processes.

sig may be one of the signals specified for sigaction() or it may be 0. If sig is 0, error checking is performed but no signal is actually sent. This can also be used to check the validity of pid.

For a process to have permission to send a signal to the process pid, the real or effective user ID of the receiving process must match the real or effective user ID of the sending process or the user must have appropriate privileges for receiving and sending signals. These user ID tests are not applied when sending SIGCONT to a process, which is a member of the same session as the sending process.

If pid > 0, sig is sent to the process whose ID is equal to pid.

If pid = 0, sig is sent to all processes whose process group ID is equal to the process group ID of the sender, and for which the sender process has permission to send the signal.

If pid = -1, sig is sent to all processes for which the sender process has permission to send the signal.

If pid < -1, sig is sent to all processes whose process group ID is equal to the absolute value of pid, and for which the sender process has permission to send the signal.

The kill () function only delivers signals to processes in the POSIX subsystem (POSIX processes) and not to Win32 processes. If you attempt to kill a process which has exec()ed a Win32 process, the signal will not be delivered to the Win32 process.

RETURN VALUES

Upon successful completion, 0 is returned. Otherwise, –1 is returned and errno is set to indicate the error.
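The behavior described in this man page can be sketched as follows. This is an illustrative example only: process_exists() and terminate_child() are hypothetical helper names, and the sketch assumes a system that provides fork() and waitpid(). The first helper uses a sig value of 0 to check the validity of a pid. The second sends SIGTERM to a child process and reports which signal ended it.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a process with this ID exists.
 * With sig == 0, kill() performs only the error checking. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    return errno == EPERM;      /* exists, but we may not signal it */
}

/* Hypothetical helper: fork a child that only waits for signals, send it
 * SIGTERM, and return the number of the signal that terminated it
 * (or -1 on failure). */
int terminate_child(void)
{
    int status;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        for ( ;; )
            pause();            /* child: wait until a signal arrives */
    }
    if (kill(child, SIGTERM) < 0)
        return -1;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Because the child keeps the default action for SIGTERM, terminate_child() is expected to return SIGTERM.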

Signals in POSIX

  • POSIX defined a new signal mechanism based on the API sigaction()

  • Different from BSD or System V signals

  • Use symbolic names instead of actual values in the code

  • Known problems with historical signal implementations

    • System V3 signal: action reset to default

    • BSD: does not reset

  • The POSIX sigaction() call does not reset the action to the default if the handler returns normally

Signals in POSIX.1

Signals on a POSIX.1 system are neither BSD nor SVR4. POSIX has defined a new signal mechanism based on the API, sigaction().

The available set of signals is described in POSIX.1 and the Single Unix Specification. Your code should use the symbolic names instead of the actual integer values, as some signal numbers may differ from traditional implementations.

The POSIX.1 committee introduced the new signal semantics because of the problems with historical signal implementations found on the BSD and System V systems. When the System V3 signal() catches a signal, the action associated with the signal is reset to default. In 4.3BSD, it is not reset. In the ISO/ANSI C standard, the signal() function either resets the default or does an implementation-defined blocking of the signal. The POSIX sigaction() call does not reset the default if the handler returns normally.

Signals in INTERIX

  • Interix supports four different sets of signal-handling APIs

    • ANSI C Signal: this is built on top of the POSIX.1

    • POSIX.1 Signals

    • BSD 4.3 Signals

    • System V Signals

  • All the SVID IPC mechanisms, Berkeley sockets, and the Berkeley string and memory functions are available in Interix

  • The historical Berkeley and System V signal models were replaced with POSIX.1 signals

INTERIX follows the POSIX signal semantics. An INTERIX process has a signal mask, so it can choose to block certain signals from arriving. (You cannot block SIGKILL or SIGSTOP.) A process starts with a signal mask inherited from its parent. If any signals are generated while blocked by the signal mask, they are added to the set of pending signals. If you are using the signal() API, note that while the handler runs the signal is still masked, and it remains masked until the mask is cleared. This can cause a problem if your code calls longjmp() from the handler. Converting from signal() to sigaction(), and from longjmp() to siglongjmp(), will clear up some of those unexpected behaviors.

INTERIX supports four different sets of signal-handling APIs, although it only supports one set of signal semantics (the POSIX.1 set):

  • ANSI C signals, supported with the function signal(). Because the ANSI C signal API is built on top of the POSIX.1 sigaction() model, it behaves slightly differently from what you might expect.

    NAME

    signal() – specify signal handling (C Standard version)

    SYNOPSIS

    #include <signal.h>

    void (*signal (int sig, void (* func)(int) ) )(int)

    DESCRIPTION

    The signal() call allows a process to catch, ignore, or generate an interrupt on receiving a signal. (The exceptions are SIGKILL and SIGSTOP, which cannot be caught or ignored.) The recommended call is the POSIX sigaction() call instead of signal(); sigaction() is slightly more robust than signal().

    sig is the signal number (possible values are listed below or in the file <signal.h>).

    The func argument can either represent a function to catch and handle the signal or one of the macros given below:

    SIG_DFL - Set the signal to the default action (listed below).

    SIG_IGN - Ignore the signal and discard pending instances. If SIG_IGN is not used, further occurrences of the signal are automatically blocked and func is called.

  • POSIX.1 signals, supported with the functions sigaction(), sigpending(), sigprocmask(), sigsuspend(), sigemptyset(), sigfillset(), sigaddset(), sigdelset(), and sigismember().

    NAME

    sigaction() – software signal facilities

    SYNOPSIS

    #include <signal.h>

    int sigaction (int sig, const struct sigaction * act, struct sigaction * oact)

    DESCRIPTION

    The system defines a set of signals that may be delivered to a process. On receiving a signal, a process takes some action; the process may ignore the signal, it may deliver the signal to a specified handler function, it may block the signal (delivery is postponed until the signal is unblocked), or it may take the default action. The sigaction () call allows the calling process to examine or specify the action to be taken when a signal is received.

  • BSD 4.3 signals, supported with the functions killpg(), sigsetmask(), sigblock(), and sigvec(). The signal mask for these functions is of type int, not sigset_t. If a future release of INTERIX supports more than 32 signals, these functions will become obsolete. Rather than depending on these functions, convert your code to use the POSIX.1 signal calls. Note that sigpause() is provided as the System V call, which does not behave in the same way as the BSD call.

  • System V signals, supported with the functions sighold(), sigignore(), sigpause(), sigrelse(), and sigset().

LAB 3: Signals in POSIX

Catching signals

The code listed in the box below causes the program to print the string "Don't do that" when a user presses Ctrl-C:

Compile the following program using the listed steps:

  1. Run the INTERIX Korn shell.

  2. Browse to the samples directory:

    $cd samples 
  3. At the $ prompt, type the following command:

    $vi catch-ctrl-c.c
  4. Type the code as given below for the catch-ctrl-c.c file. You can also copy and paste the code, if it is already coded and available in a Windows text file.

    /* catch-ctrl-c.c sample application */
    #include <stdio.h>     /* standard I/O functions                         */
    #include <unistd.h>    /* standard unix functions, like getpid()         */
    #include <signal.h>    /* signal name macros, and the signal() prototype */
    /* first, here is the signal handler */
    void catch_int(int sig_num)
    {
        /* re-set the signal handler again to catch_int, for next time */
        signal(SIGINT, catch_int);
        printf("Don't do that\n");
        fflush(stdout);
    }
    int main(int argc, char* argv[])
    {
        /* set the INT (Ctrl-C) signal handler to 'catch_int' */
        signal(SIGINT, catch_int);
        /* now, lets get into an infinite loop of doing nothing. */
        for ( ;; )
            pause();
    }
  5. On completion of the code and while in the vi editor, press the ESC key, then the : key (the colon key), followed by wq to save the file and exit from the editor application.

  6. To compile the code in the above box, use the gcc compiler as shown below:

    $ gcc -o ctrlc catch-ctrl-c.c
  7. Now you can execute the application by typing the output file name and its path. Press CTRL+C several times:

    $ ctrlc
    ^CDon't do that
    ^CDon't do that
    ^CDon't do that

    Use CTRL+Z to return to the prompt. The program is stopped, but still loaded in memory. Kill it, using:

    $ kill -9 $(ps -ef | grep ctrlc | grep -v grep | tr -s " " " " | cut -f 3 -d " ")

Masking signals

One of the nasty problems that might occur when handling a signal is the occurrence of a second signal while the signal handler function executes. Such a signal might be of a different type than the one being handled, or even of the same type. As a result, you should take some precautions inside the signal handler function, to avoid races. Fortunately, the system also contains some features that will allow you to block signals from being processed.

Masking signals with sigprocmask()

The POSIX function used to mask signals in the global context is the sigprocmask() system call. It allows you to specify a set of signals to block, and returns a list of signals that were previously blocked.

NAME

sigprocmask() – manipulate current signal mask

SYNOPSIS

#include <signal.h>

int sigprocmask(int how, const sigset_t * set, sigset_t * oset)

DESCRIPTION

The sigprocmask() function examines and/or changes the current signal mask (those signals that are blocked from delivery). Signals that belong to the current signal mask set are blocked. The oset argument is set by the call to the previous value of the signal mask. If it is set to NULL, it is ignored. The set argument points to a signal set that describes the changes to be made to the process's signal mask. If set is NULL, the mask is not changed.

The signal sets passed to sigprocmask() are built and examined with the sigset functions, as in the following sample:

    #include <stdio.h>
    #include <signal.h>
    int main(void)
    {
        /* define a new mask set */
        sigset_t mask_set;
        /* first clear the set (i.e. make it contain no signal numbers) */
        sigemptyset(&mask_set);
        /* let's add the TSTP and INT signals to our mask set */
        sigaddset(&mask_set, SIGTSTP);
        sigaddset(&mask_set, SIGINT);
        /* and just for fun, let's remove the TSTP signal from the set */
        sigdelset(&mask_set, SIGTSTP);
        /* now, let's check if the INT signal is defined in our set */
        if (sigismember(&mask_set, SIGINT))
            printf("signal INT is in our set\n");
        else
            printf("signal INT is not in our set - how strange...\n");
        /* finally, let's make the set contain ALL signals available on our system */
        sigfillset(&mask_set);
        return 0;
    }

LAB 4: Using Signals with Masks

In this lab, you will count the number of times a user presses Ctrl-C and, on the fifth press, ask whether the user really wants to exit. Furthermore, if the user presses Ctrl-Z, the number of Ctrl-C presses is printed on the screen.

Compile the following program using the listed steps:

  1. Run the INTERIX Korn shell.

  2. Browse to the samples directory:

    $cd samples
  3. At the $ prompt, type the following command:

    $vi count-ctrl-c.c
  4. Type the code as given below for the count-ctrl-c.c file. You can also copy and paste the code, if it is already coded and available in a Windows text file.

    /* count-ctrl-c.c -- program to count 5 ^C presses before asking to Exit */
    #include <stdio.h>     /* standard I/O functions                         */
    #include <stdlib.h>    /* exit()                                         */
    #include <unistd.h>    /* standard unix functions, like getpid()         */
    #include <signal.h>    /* signal name macros, and the signal() prototype */
    /* first, define the Ctrl-C counter, initialize it with zero. */
    int ctrl_c_count = 0;
    #define     CTRL_C_THRESHOLD     5
    /* the Ctrl-C signal handler */
    void catch_int(int sig_num)
    {
        sigset_t mask_set;     /* used to set a signal masking set. */
        sigset_t old_set;     /* used to store the old mask set.   */
        /* re-set the signal handler again to catch_int, for next time */
        signal(SIGINT, catch_int);
        /* mask any further signals while you are inside the handler. */
        sigfillset(&mask_set);
        sigprocmask(SIG_SETMASK, &mask_set, &old_set);
        /* increase count, and check if threshold was reached */
        ctrl_c_count++;
        if (ctrl_c_count >= CTRL_C_THRESHOLD) {
         char answer[30];
         /* prompt the user to tell you if to really exit or not */
         printf("\nReally Exit? [y/N]: ");
         fflush(stdout);
         gets(answer);
         if (answer[0] == 'y' || answer[0] == 'Y') {
            printf("\nExiting...\n");
            fflush(stdout);
            exit(0);
         }
         else {
            printf("\nContinuing\n");
            fflush(stdout);
            /* reset Ctrl-C counter */
            ctrl_c_count = 0;
         }
        }
        /* restore the old signal mask */
        sigprocmask(SIG_SETMASK, &old_set, NULL);
    }
    /* the Ctrl-Z signal handler */
    void catch_suspend(int sig_num)
    {
        sigset_t mask_set;     /* used to set a signal masking set. */
        sigset_t old_set;     /* used to store the old mask set.   */
        /* re-set the signal handler again to catch_suspend, for next time */
        signal(SIGTSTP, catch_suspend);
        /* mask any further signals while you are inside the handler. */
        sigfillset(&mask_set);
        sigprocmask(SIG_SETMASK, &mask_set, &old_set);
        /* print the current Ctrl-C counter */
        printf("\n\nSo far, '%d' Ctrl-C presses were counted\n\n", ctrl_c_count);
        fflush(stdout);
        /* restore the old signal mask */
        sigprocmask(SIG_SETMASK, &old_set, NULL);
    }
    int main(int argc, char* argv[])
    {
        /* set the Ctrl-C and Ctrl-Z signal handlers */
        signal(SIGINT, catch_int);
        signal(SIGTSTP, catch_suspend);
        /* enter an infinite loop of waiting for signals */
        for ( ;; )
         pause();
        return 0;
    }
  5. On completion of the code and while in the vi editor, press the ESC key, then the : key (the colon key), followed by wq to save the file and exit from the editor application.

  6. To compile this file, use the gcc compiler as shown below:

    $ gcc -o countctrlc count-ctrl-c.c
  7. Now you can execute the application by typing the output file name and its path:

    $ countctrlc
    ^C^C^C^C^C
    Really Exit? [y/N]:

    You can press Y to exit, or press any other key to continue. The application then waits for another five Ctrl-C key presses before showing this message again.

  8. Press CTRL+Z, and the program will output this message:

       So far, '3' Ctrl-C presses were counted

    The number depends on how many times you pressed Ctrl-C before pressing Ctrl-Z.

    Any other CTRL+key combination will be displayed on the console in the following manner:

       ^X for CTRL+X
       ^B for CTRL+B

    Pressing Y to exit yields:

       $ countctrlc
       ^C^C^C^C^C
       Really Exit? [y/N]: warning: this program uses gets(), which is unsafe.
       y
       Exiting...

    It is left as an exercise for the student to modify the code to use fgets(), which is inherently safer. The use of gets() leaves you subject to overruns.
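One possible shape for that exercise is sketched below. It is a suggestion, not the lab's official solution: read_yes_no() is a hypothetical helper that would replace the gets(answer) call and the test on answer[0]. Unlike gets(), fgets() is told the size of the buffer and never writes past it.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from the given stream and return 1
 * if it starts with 'y' or 'Y', 0 otherwise (including on end of file). */
int read_yes_no(FILE *in)
{
    char answer[30];

    /* fgets() stores at most sizeof(answer) - 1 characters plus the null */
    if (fgets(answer, sizeof(answer), in) == NULL)
        return 0;
    return answer[0] == 'y' || answer[0] == 'Y';
}
```

In catch_int(), the test would become if (read_yes_no(stdin)). Note that a line longer than the buffer is read only in part, and the rest remains in the stream for the next read.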

Signals in INTERIX

  • The list of signals that are generated on INTERIX is mentioned in the table below.

  • The default action for all of them is terminating the process. (Other signals are defined in <signal.h>, so you can pass them to kill(), but these signals are not generated.)

The following table lists the signals that are generated on INTERIX along with their meaning.

    Signal       Meaning
    SIGABRT      Abnormal termination (see abort())
    SIGALRM      Timeout (see alarm())
    SIGFPE       Erroneous arithmetic operation
    SIGHUP       Controlling terminal hung up
    SIGILL       Hardware interrupt (illegal instruction)
    SIGINT       Interactive interrupt
    SIGIO        I/O is possible on a descriptor
    SIGKILL      Termination (cannot be caught or ignored)
    SIGPIPE      Write on a pipe that isn't open for reading by any process
    SIGQUIT      Interactive termination signal
    SIGSEGV      Invalid memory reference
    SIGTERM      Termination signal
    SIGURG       Urgent condition on socket
    SIGUSR1      Application-defined signal 1
    SIGUSR2      Application-defined signal 2
    SIGVTALRM    Virtual time alarm
    SIGWINCH     Window size change

Job-control signals

The following job-control signals are also provided:

    Signal     Meaning                                                Default Action
    SIGCHLD    Child process stopped or terminated                    Ignored
    SIGCONT    Continue                                               Continue if stopped
    SIGSTOP    Stop (cannot be caught or ignored)                     Stop process
    SIGTSTP    Interactive stop                                       Stop process
    SIGTTIN    Read from controlling terminal by background process   Stop process
    SIGTTOU    Write to controlling terminal by background process    Stop process

Note that the SIGCHLD signal is not the System V SIGCLD signal. It is similar but has slightly different semantics.

From INTERIX 2.1 onwards, SIGCLD is synonymous with SIGCHLD, and SIGIOT is synonymous with SIGABRT.

Sockets in POSIX

  • INTERIX implements the Berkeley-style socket interfaces, including

    • bind(), accept(), and connect()

  • The INTERIX implementation uses the Windows Winsock library to access the network

    • Thus all administration is still handled using the usual Windows NT tools.

  • Only the TCP/IP protocol is supported for sockets, not all of the protocols of the underlying Winsock implementation

A socket is a BSD method for accomplishing interprocess communication (IPC). This means that a socket allows one process to speak to another, much as a telephone allows one person to speak to another.

Because sockets can have several types, you must specify what type of socket you want when you create one. One option that you have is the addressing format of a socket. Just as the mail service uses a different scheme to deliver mail than the telephone company uses to complete calls, so can sockets differ. The two most common addressing schemes are AF_UNIX and AF_INET. AF_UNIX addressing uses UNIX pathnames to identify sockets. These sockets are very useful for IPC between processes on the same machine. AF_INET addressing uses Internet addresses, which are four-byte numbers usually written as four decimal numbers separated by periods (such as 192.9.200.10). In addition to the machine address, there is also a port number that allows more than one AF_INET socket on each machine. AF_INET addresses are the ones covered in this section, as they are the most useful and widely used. The AF_INET sockets used in this section are stream sockets based on TCP (not on UDP).

After you create a socket to receive calls, you must wait for calls to that socket. The socket must first be put into listening mode by calling listen() before it can accept any connections. The accept() function is then used to wait for a call. Calling accept() is analogous to picking up the telephone when it rings: accept() returns a new socket that is connected to the caller.

Server Client Communication

Figure: Typical socket program

The above slide shows a typical scenario for a connection-oriented transfer. First, the server is started; then, sometime later, a client is started that connects to the server.

Clients and servers require a set of conventions before a service can be established. This set of conventions consists of a protocol that must be implemented at both ends of a connection.

A server process normally listens at a well-known port for service requests. Alternative schemes that use a multi-service server can be used to eliminate a number of server processes clogging the system while remaining dormant most of the time.

A network connection can be connection-oriented or connectionless. The former case is more like file I/O than the latter, because once you open a connection with another process, the network I/O on that connection is always with the same peer process. With a connectionless protocol, there is nothing like an "open" connection because every network I/O operation could be with a different port on a different host.
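The connectionless case can be sketched with a UDP datagram sent over the loopback interface. This example is an illustration that goes slightly beyond the TCP labs in this section: datagram_round_trip() is a hypothetical name, and the code assumes a system that provides SOCK_DGRAM sockets, the socklen_t type, and <arpa/inet.h>. Note that there is no listen(), accept(), or connect() call, and that every sendto() call names its destination.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send msg as one datagram to a local receiving
 * socket and return the number of bytes received (-1 on error). */
int datagram_round_trip(const char *msg, char *out, size_t out_len)
{
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(addr);
    ssize_t got = -1;
    int rx, tx;

    rx = socket(AF_INET, SOCK_DGRAM, 0);    /* receiving socket */
    tx = socket(AF_INET, SOCK_DGRAM, 0);    /* sending socket   */
    if (rx < 0 || tx < 0)
        goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the system pick a port */
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    /* find out which port was assigned */
    if (getsockname(rx, (struct sockaddr *)&addr, &addr_len) < 0)
        goto done;

    /* no open connection: the destination is given with each datagram */
    if (sendto(tx, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    got = recvfrom(rx, out, out_len, 0, NULL, NULL);

done:
    if (rx >= 0)
        (void) close(rx);
    if (tx >= 0)
        (void) close(tx);
    return (int) got;
}
```

The next datagram sent from the same socket could go to a different port on a different host, which is the property the paragraph above describes.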

LAB 5: Socket Programming

Building TCP Socket Programs

  • To build a TCP socket program, you should use the following command format:

  • cc -o prog prog.c -lsocket

  • The following definitions apply to this format:

  • cc

    • C compile command. Indicates that prog.c will be compiled.

  • -o

    • Output option. Indicates that the compiled program will be put in prog.

  • prog

    • Indicates the resulting compiled program.

  • prog.c

    • Indicates the program source to be compiled.

  • -lsocket

    • Socket library. Implements the sockets API

Building TCP Socket Programs on Win NT/2000

To build a TCP socket program, you should use the following command format:

$gcc -o prog prog.c -lsocket

The following definitions apply to this format:

  • gcc: The C compile command. It indicates that prog.c will be compiled.

  • -o: The output option. It indicates that the compiled program will be put in prog.

  • prog: The resulting compiled program.

  • prog.c: The program source to be compiled.

  • -lsocket: The socket library, which implements the sockets API.

Compile the following program using the following steps:

  1. Run the INTERIX Korn shell.

  2. Browse to the samples directory:

    $cd samples
  3. At the $ prompt, type the following command:

    $vi tcp_sock_server.c
  4. Type the code as given below for the tcp_sock_server.c file. You can also copy and paste the code, if it is already coded and available in a Windows text file.

    To build the following SERVER program, use the following command:

         $gcc -o tcp_sock_server tcp_sock_server.c -lsocket
    /* tcp_sock_server.c */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <string.h>
    #include <netdb.h>
    #include <stdio.h>
    /* This program creates a socket and begins an infinite loop.
     * Each time through the loop, it accepts a connection and prints
     * out messages from it. When the connection breaks, or a termination
     * message comes through, the program accepts a new connection. */
    /*ARGSUSED*/
    int
    main(int argc, char *argv[])
    {
         int sock;
         size_t length;
         struct sockaddr_in server;
         int msgsock;
         char buf[1024];
         int rval;
         /*
          * Create socket. PF_INET used here is actually a POSIX.1 standard,
          * and it is #defined to AF_INET.
          */
         sock = socket(PF_INET, SOCK_STREAM, 0);
         if (sock < 0) {
              perror("opening stream socket");
              exit(1);
         }
         /*
          * Name socket using wildcards
          */
         server.sin_family = AF_INET;
         server.sin_addr.s_addr = INADDR_ANY;
         server.sin_port = 0;
         if (bind(sock, (struct sockaddr *)&server, sizeof(server))) {
              perror("binding stream socket");
              exit(1);
         }
         /*
          * Find out assigned port number and print it out
          */
         length = sizeof(server);
         if (getsockname(sock, (struct sockaddr *)&server, &length)) {
              perror("getting socket name");
              exit(1);
         }
         (void) printf("Socket has port #%d\n", ntohs(server.sin_port));
         (void) fflush(stdout);
         /*
          * Start accepting connections
          */
         (void) listen(sock, 1);
         /*CONSTCOND*/
         while (1) {
              msgsock = accept(sock, 0, 0);
              if (msgsock < 0) {
                   perror("accept");
                   break;
              }
              do {
                   (void) memset(buf, 0, sizeof(buf));
                   rval = read(msgsock, buf, sizeof(buf));
                   if (rval < 0) {
                        perror("reading message");
                   } else if (rval == 0) {
                        (void) printf("Ending connection\n");
                   } else {
                        (void) printf("-->%.*s\n", rval, buf);
                   }
                   (void) fflush(stdout);
              } while (rval > 0);
              (void) close(msgsock);
         }
         (void) close(sock);
         return 1;
    }
  5. On completion of the code and while in the vi editor, press the ESC key, then the : key (the colon key), followed by wq to save the file and exit from the editor application.

  6. Repeat steps 3 through 5 for the client code tcp_sock_client.c.

  7. To build the client program, use the following command:

    $gcc -o tcp_sock_client tcp_sock_client.c -lsocket
    /* tcp_sock_client.c */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <string.h>
    #include <netdb.h>
    #include <stdio.h>
    #define DATA "Socket program example… message from client"
    /*
     * This program creates a socket and initiates a connection with the socket
     * given in the command line. One message is sent over the connection and
     * then the socket is closed, ending the connection. The form of the command
     * line is:
     *      tcp_sock_client <hostname> <portnumber>
     */
    int
    main(int argc, char *argv[])
    {
         struct sockaddr_in server;
         struct hostent *hp;
         int s;
         if (argc != 3) {
              (void) fprintf(stderr, "usage: %s <hostname> <local port>\n",
                   argv[0]);
              exit(1);
         }
         /*
          * Create socket
          */
         s = socket(PF_INET, SOCK_STREAM, 0);
         if (s < 0) {
              perror("opening stream socket");
              exit(1);
         }
         /*
          * Connect socket using name specified by command line
          */
         server.sin_family = AF_INET;
         hp = gethostbyname(argv[1]);
         if (hp == 0) {
              (void) fprintf(stderr, "%s: unknown host\n", argv[1]);
              exit(1);
         }
         (void) memcpy(&server.sin_addr, hp->h_addr, hp->h_length);
         server.sin_port = htons((unsigned short)atoi(argv[2]));
         if (connect(s, (struct sockaddr *)&server, sizeof(server)) < 0) {
              perror("connecting stream socket");
              exit(1);
         }
         if (write(s, DATA, sizeof(DATA)) < 0) {
              perror("writing on stream socket");
              exit(1);
         }
         return 0;
    }
  8. Now you can execute the application by typing the output file name and its path. Remember to execute the TCP Server program first.

    $ tcp_sock_server
    Socket has port #8455429

    The port number will be different when you execute the program on your server.

  9. Now open another instance of the INTERIX Korn shell, browse to the samples directory, and then execute the client application with the following command:

    $tcp_sock_client localhost 8455429

    The second parameter, the port number, must match the port on which the server process is listening. If you give a wrong port number, the following error is shown:

    connecting stream socket: Connection refused

Check the server process and its port number and then retry the client application.

If the port number is correct, the server process displays the following message on the console:

-->Socket program example… message from client
Ending connection

The message text is the constant DATA, defined in the client program as shown below:

#define DATA "Socket program example… message from client"

System V IPC in POSIX

  • Interprocess Communication (IPC)

    • IPC provides different methods of communication between processes

    • IPC can be used between processes executing on a single system, as well as between processes executing on different systems

  • System V IPC

    • The three types of IPC

      • message queues

      • semaphores

      • shared memory

      are collectively referred to as "System V IPC"

    • They are used for process communication on a single system

    • They share similarities in their system call interface and in the information that the kernel maintains on them

In traditional single-process programming, different modules within the process can communicate with each other using global variables, function calls, and the arguments and results passed back and forth between functions and their callers. When dealing with separate processes, each executing within its own address space, there are more details to consider. For two processes to communicate with each other, they must both agree to it, and the operating system must provide some facilities for interprocess communication (IPC).

There are different types of IPCs. These are:

  • Message queues

  • Semaphores

  • Shared memory

These types are collectively referred to as System V IPC. They share similarities in the system calls that access them, and in the information that the kernel maintains on them.

Examples:

  • msgget - System call to create or open a message queue

  • semget - System call to create or open a semaphore

  • shmget - System call to create or open a shared memory segment
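
The similarity among the three "get" calls can be seen in a short sketch. The example below is an illustration only and is not part of the labs. It assumes a system that provides all three System V IPC types, it uses IPC_PRIVATE instead of a key agreed between processes, and the names ipc_get_demo and PERMS are hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/sem.h>
#include <sys/shm.h>

#define PERMS 0600      /* owner read/write (hypothetical choice) */

/*
 * Create one private object of each System V IPC type, then remove it.
 * Returns the number of objects that were both created and removed,
 * so 3 means that all three calls worked.
 */
int ipc_get_demo(void)
{
    int ok = 0;
    int msqid, semid, shmid;

    /* Each "get" call takes a key and a flag word that carries the
       permission bits. IPC_PRIVATE asks for a new, unshared object. */
    msqid = msgget(IPC_PRIVATE, PERMS | IPC_CREAT);
    semid = semget(IPC_PRIVATE, 1, PERMS | IPC_CREAT);     /* 1 semaphore */
    shmid = shmget(IPC_PRIVATE, 4096, PERMS | IPC_CREAT);  /* 4096 bytes */

    /* Each "ctl" call accepts IPC_RMID to remove the object again. */
    if (msqid >= 0 && msgctl(msqid, IPC_RMID, NULL) == 0)
        ok++;
    if (semid >= 0 && semctl(semid, 0, IPC_RMID) == 0)
        ok++;
    if (shmid >= 0 && shmctl(shmid, IPC_RMID, NULL) == 0)
        ok++;
    return ok;
}

/* Usage: printf("%d of 3 IPC objects created and removed\n", ipc_get_demo()); */
```

Removing each object immediately keeps the sketch from leaving IPC objects behind, because System V IPC objects otherwise persist after the creating process exits.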

System V IPC in POSIX

  • Similar to the information maintained for files, the kernel maintains information for each type of IPC. For example, the permission structure maintained by the kernel for a message queue is:

    struct ipc_perm {
         ushort uid;          /* owner's user id */
         ushort gid;          /* owner's group id */
         ushort cuid;         /* creator's user id */
         ushort cgid;         /* creator's group id */
         ushort mode;         /* access modes */
         ushort seq;          /* slot usage sequence number */
         key_t  key;          /* key */
    };
  • The above structure and other constants for System V IPC calls are defined in <sys/ipc.h>

The kernel maintains a structure of information for every IPC channel. This is very similar to the information maintained for files. The above slide illustrates an ipc_perm structure.

The structure ipc_perm, and other manifest constants for the System V IPC calls, are defined in <sys/ipc.h> under the INTERIX environment.
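
A program can read the ipc_perm information that the kernel keeps for an IPC channel. The following is a minimal sketch, assuming that msgctl() with IPC_STAT is available and that the msqid_ds structure exposes the permissions in a msg_perm member, as it does on common System V implementations. The function name queue_perm_demo is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * Create a private message queue with the given permission bits, copy
 * the kernel's ipc_perm structure for it into *out, and remove the queue.
 * The permission bits must include owner read access (0400), because
 * IPC_STAT requires read permission on the queue.
 * Returns 0 on success, -1 on error.
 */
int queue_perm_demo(int perms, struct ipc_perm *out)
{
    struct msqid_ds ds;
    int rc = -1;
    int msqid;

    assert(out != NULL);
    msqid = msgget(IPC_PRIVATE, perms | IPC_CREAT);
    if (msqid < 0)
        return -1;

    /* IPC_STAT copies the kernel's information about the queue into ds */
    if (msgctl(msqid, IPC_STAT, &ds) == 0) {
        *out = ds.msg_perm;
        rc = 0;
    }
    msgctl(msqid, IPC_RMID, NULL);
    return rc;
}

/* Usage:
 *     struct ipc_perm p;
 *     if (queue_perm_demo(0640, &p) == 0)
 *         printf("mode %o\n", (unsigned) (p.mode & 0777));
 */
```

For a newly created queue, the owner and creator IDs are expected to be equal, and the low nine bits of mode hold the permission bits passed to msgget().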

Message Queues

  • Message queues

    • A message queue is a mechanism for sharing data among processes.

    • The data is shared in the form of a message.

    • All message queues in the System V implementation are maintained by the kernel

    • The queues have an associated message queue identifier (msqid)

    • A process can read and write messages to many queues

    • Many processes can share the same queue for read and write

    • Every message on a queue has the following attributes:

      • message type (a long integer)

      • length of the data portion of the message (can be zero)

      • data (if the length is greater than zero)

Message queues

A message queue is a mechanism for sharing data among processes. The data is shared in the form of a message.

All message queues in System V implementation are maintained by the kernel.

Some operating systems restrict the passing of messages, such that a process can only send a message to another specific process. System V has no such limitations.

In the System V implementation of messages, all messages have an associated message queue identifier. This identifier identifies the particular queue of messages.

Every message in a message queue has some attributes. The above slide describes the message attributes.

Some important system calls used in a message queue are listed below:

A new message queue is created, or an existing message queue is accessed with the msgget() system call. The signature of the call is:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>
    int msgget(key_t key, int msgflag);

Once a message queue is created, you can put messages on the queue using the msgsnd() system call. The signature of the call is:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>
    int msgsnd(int msqid, struct msgbuf *ptrkey, int length, int flag);

A message is read from the messages on a queue using the msgrcv() system call. The signature of the call is:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>
    int msgrcv(int msqid, struct msgbuf *ptrkey, int length, long msgtype, int flag);

The msgctl() system call provides a variety of control operations on a message queue. For example, a message queue is deleted using this call. The signature of the call is:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>
    int msgctl(int msqid, int cmd, struct msqid_ds *buff);
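
The four calls can be combined in a single process to see how they fit together. The following is a minimal sketch, not the lab program: it assumes a private queue (IPC_PRIVATE), and the structure demo_msg, the constant MAXTEXT, and the function msg_roundtrip are hypothetical names. IPC_NOWAIT is used so that the sketch returns an error instead of blocking.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MAXTEXT 128             /* hypothetical limit for this sketch */

struct demo_msg {
    long mtype;                 /* message type, must be > 0 */
    char mtext[MAXTEXT];        /* data portion of the message */
};

/*
 * Create a queue, send text as one message of type 1, receive it back
 * into out (which must hold at least MAXTEXT bytes), and remove the queue.
 * Returns the number of data bytes received, or -1 on error.
 */
int msg_roundtrip(const char *text, char *out)
{
    struct demo_msg snd, rcv;
    size_t len;
    int msqid;
    int n = -1;

    assert(text != NULL && out != NULL);
    len = strlen(text);
    if (len >= MAXTEXT)
        return -1;

    msqid = msgget(IPC_PRIVATE, 0600 | IPC_CREAT);
    if (msqid < 0)
        return -1;

    snd.mtype = 1L;
    memcpy(snd.mtext, text, len);

    /* The length argument counts only the data, not the type field. */
    if (msgsnd(msqid, &snd, len, IPC_NOWAIT) == 0)
        n = (int) msgrcv(msqid, &rcv, MAXTEXT, 1L, IPC_NOWAIT);

    if (n >= 0) {
        memcpy(out, rcv.mtext, (size_t) n);
        out[n] = '\0';          /* the data is not null terminated */
    }
    msgctl(msqid, IPC_RMID, NULL);
    return n;
}

/* Usage:
 *     char out[MAXTEXT];
 *     int n = msg_roundtrip("test.txt", out);
 */
```

A message with a data length of zero is valid, which is how the lab programs signal the end of the file content.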

LAB 6: Message Queues

  • A simple client-server application demonstrates the message queues interface in System V IPC

    • The application involves two message queues between the client and the server.

    • The client reads the filename from the standard input and writes it to the first message queue

    • The server reads a filename on the first queue, reads the content of the file and writes it on to the second queue.

    • The client reads the file content from the second message queue and writes it to the standard output

    • The following figure explains the client-server example

      [Figure posix28: client-server example using two message queues]

Building the client-server application using message queues

There are three files in this client-server application: server.c, client.c, and submsg.c.

  • submsg.c is a common file used by server.c and client.c. The submsg.c file contains read and write function implementation for the client and the server.

  • server.c is a server program.

  • client.c is a client program.

Build the application using the following steps:

  1. Run the INTERIX Korn shell.

  2. Browse to the samples directory by typing the following command:

    $cd samples
  3. At the $ prompt, type the following command:

    $vi server.c
  4. Type the code as given below for the server.c file. You can also copy and paste the code, if it is already coded and available in a Windows text file.

    /* server.c */
    #include        <stdio.h>
    #include        <stdlib.h>
    #include        <string.h>
    #include        <fcntl.h>
    #include        <unistd.h>
    #include        <errno.h>
    #include        <sys/types.h>
    #include        <sys/ipc.h>
    #include        <sys/msg.h>
    #define MKEY1   1234L
    #define MKEY2   2345L
    #define PERMS   0666
    /*
     * Definition of "our" message.
     */
    #define MAXMESGDATA     (4096-16)
    #define MESGHDRSIZE     (sizeof(Mesg) - MAXMESGDATA)
    typedef struct {
      int   mesg_len;       /* #bytes in mesg_data, can be 0 or > 0 */
      long  mesg_type;      /* message type, must be > 0 */
      char  mesg_data[MAXMESGDATA];
    } Mesg;
    Mesg    mesg;
    main()
    {
            int     readid, writeid;
            /*
             * Create the message queues, if required.
             */
            if ( (readid = msgget(MKEY1, PERMS | IPC_CREAT)) < 0)
                    printf("server: can't get message queue 1");
            if ( (writeid = msgget(MKEY2, PERMS | IPC_CREAT)) < 0)
                    printf("server: can't get message queue 2");
            server(readid, writeid);
            exit(0);
    }
    server(ipcreadfd, ipcwritefd)
    int     ipcreadfd;
    int     ipcwritefd;
    {
            int     n, filefd;
            char    errmesg[256];
            /*
             * Read the filename message from the IPC descriptor.
             */
            mesg.mesg_type = 1L;
            if ( (n = mesg_recv(ipcreadfd, &mesg)) <= 0) {
                    printf("server: filename read error\n");
                    exit(1);        /* no filename; n is not a valid index */
            }
            mesg.mesg_data[n] = '\0';
            if ( (filefd = open(mesg.mesg_data, 0)) < 0) {
                    /*
                     * Error.  Format an error message and send it back
                     * to the client.
                     */
                    strcpy(errmesg, ": can't open file\n");
                    strcat(mesg.mesg_data, errmesg);
                    mesg.mesg_len = strlen(mesg.mesg_data);
                    mesg_send(ipcwritefd, &mesg);
            } else {
                    /*
                     * Read the data from the file and send a message to
                     * the IPC descriptor.
                     */
                    while ( (n = read(filefd, mesg.mesg_data, MAXMESGDATA)) > 0) {
                            mesg.mesg_len = n;
                            mesg_send(ipcwritefd, &mesg);
                    }
                    close(filefd);
                    if (n < 0)
                            printf("server: read error");
            }
            /*
             * Send a message with a length of 0 to signify the end.
             */
            mesg.mesg_len = 0;
            mesg_send(ipcwritefd, &mesg);
    }
    
  5. On completion of the code and while in the vi editor, press the ESC key, then the : key (the colon key), followed by wq to save the file and exit from the editor application.

  6. Repeat steps 3 through 5 for client.c and submsg.c.

  7. Compile the submsg program.

  8. To compile the submsg program without linking it, use the following command:

    $gcc -c submsg.c
    /*submsg.c*/
    #include        <stdio.h>
    #include        <sys/types.h>
    #include        <sys/ipc.h>
    #include        <sys/msg.h>
    /*
     * Definition of "our" message.
     */
    #define MAXMESGDATA     (4096-16)
    #define MESGHDRSIZE     (sizeof(Mesg) - MAXMESGDATA)
    typedef struct {
      int   mesg_len;       /* #bytes in mesg_data, can be 0 or > 0 */
      long  mesg_type;      /* message type, must be > 0 */
      char  mesg_data[MAXMESGDATA];
    } Mesg;
    /*
     * Send a message using the System V message queues.
     */
    mesg_send(id, mesgptr)
    int     id;
    Mesg    *mesgptr;
    {
            /*
             * Send the message - the type followed by the optional data.
             */
            if (msgsnd(id, (char *) &(mesgptr->mesg_type),
                                            mesgptr->mesg_len, 0) != 0)
                    printf("msgsnd error");
    }
    /*
     * Receive a message from a System V message queue.
     */
    int
    mesg_recv(id, mesgptr)
    int     id;
    Mesg    *mesgptr;
    {
            int     n;
            /*
             * Read the first message on the queue of the specified type.
             */
            n = msgrcv(id, (char *) &(mesgptr->mesg_type), MAXMESGDATA,
                                            mesgptr->mesg_type, 0);
            if ( (mesgptr->mesg_len = n) < 0)
                    /* err_dump("msgrcv error"); */
                    printf("msgrcv error");
            return(n);              /* n will be 0 at end of file */
    }
    
  9. Build the server program.

    To build the server program, use the following commands:

    $gcc -c server.c
    $gcc -o server server.o submsg.o

    The -c option compiles server.c, but does not link it.

  10. Build the client program.

    To build the client program, use the following commands:

    $gcc -c client.c
    $gcc -o client client.o submsg.o

    The -c option compiles client.c, but does not link it.

    /* client.c*/
    #include        <stdio.h>
    #include        <stdlib.h>
    #include        <string.h>
    #include        <unistd.h>
    #include        <errno.h>
    #include        <sys/types.h>
    #include        <sys/ipc.h>
    #include        <sys/msg.h>
    #define MKEY1   1234L
    #define MKEY2   2345L
    #define PERMS   0666
    /*
     * Definition of "our" message.
     */
    #define MAXMESGDATA     (4096-16)
    #define MESGHDRSIZE     (sizeof(Mesg) - MAXMESGDATA)
    typedef struct {
      int   mesg_len;       /* #bytes in mesg_data, can be 0 or > 0 */
      long  mesg_type;      /* message type, must be > 0 */
      char  mesg_data[MAXMESGDATA];
    } Mesg;
    Mesg    mesg;
    main()
    {
            int     readid, writeid;
            /*
             * Open the message queues. The server must have
             * already created them.
             */
            if ( (writeid = msgget(MKEY1, 0)) < 0)
                    printf("client: can't msgget message queue 1");
            if ( (readid = msgget(MKEY2, 0)) < 0)
                    printf("client: can't msgget message queue 2");
            client(readid, writeid);
            /*
             * Now we can delete the message queues.
             */
            if (msgctl(readid, IPC_RMID, (struct msqid_ds *) 0) < 0)
                    printf("client: can't RMID message queue 2");
            if (msgctl(writeid, IPC_RMID, (struct msqid_ds *) 0) < 0)
                    printf("client: can't RMID message queue 1");
            exit(0);
    }
    client(ipcreadfd, ipcwritefd)
    int     ipcreadfd;
    int     ipcwritefd;
    {
            int     n;
            /*
             * Read the filename from standard input, write it as
             * a message to the IPC descriptor.
             */
            if (fgets(mesg.mesg_data, MAXMESGDATA, stdin) == NULL)
                    printf("filename read error");
            n = strlen(mesg.mesg_data);
            if (n > 0 && mesg.mesg_data[n-1] == '\n')
                    n--;
            mesg.mesg_len = n;
            mesg.mesg_type = 1L;
            mesg_send(ipcwritefd, &mesg);
            /*
             * Receive the message from the IPC descriptor and write
             * the data to the standard output.
             */
            while ( (n = mesg_recv(ipcreadfd, &mesg)) > 0)
                    if (write(1, mesg.mesg_data, n) != n)
                            printf("data write error");
            if (n < 0)
                    printf("data read error");
    }
  11. Create a file named test.txt under the current directory. The test file should contain the following two lines:

    This is a test file.
    Testing message queues.
  12. Now execute the server application by typing the output file name and its path. Remember to execute the server program first.

    $ ./server &

    The server application runs in the background.
  13. Now execute the client application with the following command:

    $ ./client <ENTER>

Now the client application will wait for the file name as the input from the user. Specify the path of the file:

test.txt

If the file exists and the file path is correct, the client process displays the following message on the console:

This is a test file.
Testing message queues.

After displaying the above message, the client and the server programs terminate.

Semaphores

  • Semaphores

    • Semaphores are a synchronization primitive.

    • When a resource is shared among multiple processes, semaphores are used to allow only one process to use the resource at a time.

    • They are used to implement the locking mechanism for a resource

    • In the System V implementation, a semaphore is a set of one or more values.

    • Each semaphore value is a non-negative integer

Semaphores

Semaphores enable multiple processes to synchronize their operations. In a multiprocessing environment, where a resource is shared among many processes, semaphores allow only one process to use the resource at any given time. Unlike message queues in System V IPC, semaphores are not used to exchange data.

You can consider a semaphore as an integer value variable that is used as a resource counter. The value of the variable at any point in time is the number of resource units available to the semaphore. For example, if there is one resource available, the valid semaphore values are zero and one.

To obtain a resource that is controlled by a semaphore, a process needs to test the semaphore's current value. If the current value is greater than zero, the process decrements the value by one and uses the resource. If the current value is zero, the process must wait until the value is greater than zero (that is, wait for some other process to release the resource).

In the System V implementation of semaphores:

  • A semaphore is not a single value but a set of non-negative integer values. The number of values in the set can range from one to a system-defined maximum.

  • Each value in the set can assume any non-negative value, up to a system-defined maximum value.
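
The idea of a semaphore as a set of values can be sketched in a few lines. The example below is an illustration under stated assumptions: it creates a private set (IPC_PRIVATE) of three values, and the names NSEMS, demo_semun, and sem_set_roundtrip are hypothetical. Many systems require the program to define the semctl() argument union itself, so the sketch declares its own union under a different name to avoid a clash on systems that already define semun.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define NSEMS 3                 /* number of values in the set */

union demo_semun {              /* same layout as the usual semun */
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Create a private set of NSEMS semaphores, store init[] into it with
 * SETALL, read the values back into out[] with GETALL, then remove the
 * set. Returns 0 on success, -1 on error.
 */
int sem_set_roundtrip(const unsigned short init[NSEMS],
                      unsigned short out[NSEMS])
{
    union demo_semun arg;
    unsigned short tmp[NSEMS];
    int i, semid;
    int rc = -1;

    assert(init != NULL && out != NULL);
    semid = semget(IPC_PRIVATE, NSEMS, 0600 | IPC_CREAT);
    if (semid < 0)
        return -1;

    for (i = 0; i < NSEMS; i++)
        tmp[i] = init[i];

    arg.array = tmp;
    if (semctl(semid, 0, SETALL, arg) == 0) {
        arg.array = out;
        if (semctl(semid, 0, GETALL, arg) == 0)
            rc = 0;
    }
    semctl(semid, 0, IPC_RMID);
    return rc;
}

/* Usage:
 *     unsigned short init[NSEMS] = { 0, 1, 5 }, out[NSEMS];
 *     sem_set_roundtrip(init, out);
 */
```

Within a set, the individual values are addressed by a semaphore number that starts at zero, which is why the lab code later refers to members [0], [1], and [2].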

Semaphores

  • The kernel maintains a structure of information for every set of semaphores in the system. For example, the semaphore ID data structure:

    struct semid_ds {
         struct ipc_perm    sem_perm;      /* operation permission struct */
         time_t             sem_otime;     /* last semop time */
         time_t             sem_ctime;     /* last change time */
         unsigned short int sem_nsems;     /* # of semaphores in set */
    };
  • The kernel also maintains an internal data structure of information for every member of a semaphore set. For example, the semaphore structure:

    struct sem {
         unsigned short int semval;      /* semaphore value */
         unsigned short int sempid;      /* pid of last operation */
         unsigned short int semncnt;     /* # awaiting semval > cval */
         unsigned short int semzcnt;     /* # awaiting semval = 0 */
    };

The sem structure is the internal data structure used by the kernel to maintain the set of values for a semaphore. Every member of a semaphore is described by the sem structure explained on the slide.

In addition to maintaining the actual set of values for a semaphore, the kernel maintains three other pieces of information for each value in the set: the process ID of the process that performed the last operation on the value, a count of the number of processes that are waiting for the value to increase, and a count of the number of processes that are waiting for the value to become zero.
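
The per-value information that the kernel keeps can be queried with semctl(). The following is a minimal sketch, assuming the GETVAL, GETPID, GETNCNT, GETZCNT, and IPC_STAT commands are available as on common System V implementations. The structure sem_snapshot, the union demo_semun, and the function sem_snapshot_demo are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <unistd.h>

union demo_semun {              /* same layout as the usual semun */
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

struct sem_snapshot {           /* hypothetical: values read back */
    int   nsems;                /* number of semaphores in the set */
    int   semval;               /* current value of member [0] */
    pid_t sempid;               /* pid of last operation on [0] */
    int   semncnt;              /* # waiting for [0] to increase */
    int   semzcnt;              /* # waiting for [0] to become 0 */
};

/*
 * Create a private set of two semaphores, set member [0] to 0,
 * increment it once with semop(), and record what the kernel reports.
 * Returns 0 on success, -1 on error. The set is always removed.
 */
int sem_snapshot_demo(struct sem_snapshot *snap)
{
    union demo_semun arg;
    struct semid_ds ds;
    struct sembuf up;
    int semid;
    int rc = -1;

    assert(snap != NULL);
    semid = semget(IPC_PRIVATE, 2, 0600 | IPC_CREAT);
    if (semid < 0)
        return -1;

    up.sem_num = 0;             /* member [0] */
    up.sem_op = 1;              /* add one */
    up.sem_flg = 0;

    arg.val = 0;
    if (semctl(semid, 0, SETVAL, arg) < 0)
        goto done;
    if (semop(semid, &up, 1) < 0)
        goto done;
    arg.buf = &ds;
    if (semctl(semid, 0, IPC_STAT, arg) < 0)
        goto done;

    snap->nsems = (int) ds.sem_nsems;
    snap->semval = semctl(semid, 0, GETVAL);
    snap->sempid = (pid_t) semctl(semid, 0, GETPID);
    snap->semncnt = semctl(semid, 0, GETNCNT);
    snap->semzcnt = semctl(semid, 0, GETZCNT);
    rc = 0;
done:
    semctl(semid, 0, IPC_RMID);
    return rc;
}
```

Because no other process is waiting on the set in this sketch, both wait counts are expected to be zero, and the recorded process ID is expected to be that of the calling process.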

Semaphores: System Calls

  • There are multiple limits associated with semaphores.

    There is a limit on

    • the number of semaphores that can be used at a time

    • the number of unique semaphore sets

    • the maximum number of semaphores per semaphore set

    • the maximum value of any semaphore

  • System calls for semaphores

    • semget - creates a semaphore or accesses an existing semaphore

    • semop - performs operations on the semaphore values in the set

    • semctl - provides various control operations on a semaphore

The system calls used with semaphores are described below:

semget() System Call

A semaphore is created, or an existing semaphore is accessed, with the semget() system call. The signature of the call is:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    int semget(key_t key, int nsems, int semflag);

The value returned by semget() is the semaphore identifier, semid, or -1 if an error occurred.

semop() System Call

Once the semaphore set is opened, operations are performed on one or more semaphore values in the set using the semop() system call. The signature of the call is:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    int semop(int semid, struct sembuf *opsptr, unsigned int nops);

The operations passed to the semop() system call are guaranteed to be carried out atomically by the kernel. The kernel either performs all of the operations that are specified, or it does not perform any of them.

One of the options that can be set for an operation, IPC_NOWAIT, directs the system not to wait if the operation cannot be performed immediately. The return value from semop() is zero if the operation is successful, or -1 if an error occurred.
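
The all-or-nothing behavior of semop() can be observed with two operations passed in a single call. The following is a minimal sketch, not taken from the lab: the helper names pair_create, pair_transfer, pair_value, and pair_remove and the union demo_semun are hypothetical. The second operation carries the IPC_NOWAIT flag, so the call fails with EAGAIN instead of blocking when that operation cannot be performed.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

union demo_semun {              /* same layout as the usual semun */
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set of two semaphores with the given values. */
int pair_create(int v0, int v1)
{
    union demo_semun arg;
    unsigned short init[2];
    int semid;

    assert(v0 >= 0 && v1 >= 0);
    semid = semget(IPC_PRIVATE, 2, 0600 | IPC_CREAT);
    if (semid < 0)
        return -1;
    init[0] = (unsigned short) v0;
    init[1] = (unsigned short) v1;
    arg.array = init;
    if (semctl(semid, 0, SETALL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/*
 * One semop() call with two operations: add 1 to [0], subtract 1 from [1].
 * Returns 0 if both were applied, 1 if neither was applied because [1]
 * was already 0, and -1 on any other error.
 */
int pair_transfer(int semid)
{
    struct sembuf ops[2];

    ops[0].sem_num = 0;  ops[0].sem_op = 1;   ops[0].sem_flg = 0;
    ops[1].sem_num = 1;  ops[1].sem_op = -1;  ops[1].sem_flg = IPC_NOWAIT;

    if (semop(semid, ops, 2) == 0)
        return 0;
    return (errno == EAGAIN) ? 1 : -1;
}

/* Read the current value of member n of the set. */
int pair_value(int semid, int n)
{
    return semctl(semid, n, GETVAL);
}

/* Remove the whole set. */
int pair_remove(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

When member [1] is zero, the decrement cannot be performed, so the increment of member [0] is not applied either and its value stays unchanged.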

semctl() System Call

The semctl() system call provides a variety of control operations on semaphores.

One of the operations for which this call is used is the deletion of a semaphore.

The signature of the call is:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    int semctl(int semid, int semnum, int cmd, union semun arg);

The same ipc_perm access structure that is described earlier in this section for message queues is used to determine the access rights for a semaphore.
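
Three common semctl() commands are SETVAL, GETVAL, and IPC_RMID. The following is a minimal sketch of their use on a private, single-member set. The function name semctl_demo and the union demo_semun are hypothetical, and the sketch assumes that the initial value is within the system-defined maximum.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

union demo_semun {              /* same layout as the usual semun */
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Create a private semaphore, set it to initial with SETVAL, read it
 * back with GETVAL, and delete it with IPC_RMID.
 * Returns the value read back, or -1 on error.
 */
int semctl_demo(int initial)
{
    union demo_semun arg;
    int semid, val;

    assert(initial >= 0);
    semid = semget(IPC_PRIVATE, 1, 0600 | IPC_CREAT);
    if (semid < 0)
        return -1;

    arg.val = initial;
    if (semctl(semid, 0, SETVAL, arg) < 0)
        val = -1;
    else
        val = semctl(semid, 0, GETVAL);

    /* IPC_RMID removes the whole set; the semnum argument is ignored */
    if (semctl(semid, 0, IPC_RMID) < 0)
        return -1;
    return val;
}

/* Usage: printf("value read back: %d\n", semctl_demo(7)); */
```

SETVAL and GETVAL act on one member of the set, selected by the semnum argument, while IPC_RMID always removes the entire set.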

LAB 7: Semaphores

  • The lab demonstrates the use of the semaphore interface in System V IPC using an application.

  • The application locks a file using semaphores before updating and prevents any concurrent update

  • Multiple instances of the application can update the file without any corruption of the file data.

  • A sequence number is maintained in a file used by the application. Each time the file is accessed the sequence number is incremented by one.

  • The application has the following execution steps:

    • A semaphore is obtained to lock the file before accessing it for update

    • The file with the sequence number is read

    • On success of the previous step, the sequence number in the file is incremented

    • The lock is released

    • The above steps are repeated twenty times

Building the File locking application using semaphores

Two source files, lockmain.c and subfunctions.c, and an input file, seqno, must be created for the lab.

  • subfunctions.c implements the common routines for creating semaphores, performing operations on semaphores, removing semaphores, and closing semaphores.

  • lockmain.c is the main lab program that contains the flow.

  • seqno is the file that the filelock program locks.

Compile the program using the following steps:

  1. Run the INTERIX Korn shell

  2. Browse to the samples directory

    $cd samples
  3. At the $ prompt, type the following command

    $vi subfunctions.c
  4. Type the code as given below for subfunctions.c file. You can also copy and paste the code, if it is already coded and available in a Windows text file.

    /* subfunctions.c*/
    #include        <stdio.h>
    #include        <errno.h>
    #include        <sys/types.h>
    #include        <sys/ipc.h>
    #include        <sys/sem.h>
    #define BIGCOUNT        10000           /* initial value of process counter */
    /*
     * Define the semaphore operation arrays for the semop() calls.
     */
    static struct sembuf    op_lock[2] = {
            2, 0, 0,        /* wait for [2] (lock) to equal 0 */
            2, 1, SEM_UNDO  /* then increment [2] to 1 - this locks it */
                            /* UNDO to release the lock if processes exit
                               before explicitly unlocking */
    };
    static struct sembuf    op_endcreate[2] = {
            1, -1, SEM_UNDO,/* decrement [1] (proc counter) with undo on exit */
                            /* UNDO to adjust proc counter if process exits
                               before explicitly calling sem_close() */
            2, -1, SEM_UNDO /* then decrement [2] (lock) back to 0 */
    };
    static struct sembuf    op_open[1] = {
            1, -1, SEM_UNDO /* decrement [1] (proc counter) with undo on exit */
    };
    static struct sembuf    op_close[3] = {
            2, 0, 0,        /* wait for [2] (lock) to equal 0 */
            2, 1, SEM_UNDO, /* then increment [2] to 1 - this locks it */
            1, 1, SEM_UNDO  /* then increment [1] (proc counter) */
    };
    static struct sembuf    op_unlock[1] = {
            2, -1, SEM_UNDO /* decrement [2] (lock) back to 0 */
    };
    static struct sembuf    op_op[1] = {
            0, 99, SEM_UNDO /* decrement or increment [0] with undo on exit */
                            /* 99 is set to the actual amount to add
                               or subtract (positive or negative) */
    };
    /****************************************************************************
     * Create a semaphore with a specified initial value.
    */
    int
    sem_create(key, initval)
    key_t   key;
    int     initval;        /* used if we create the semaphore */
    {
            register int            id, semval;
            union semun {
                    int             val;
                    struct semid_ds *buf;
                    ushort          *array;
            } semctl_arg;
            if (key == IPC_PRIVATE)
                    return(-1);     /* not intended for private semaphores */
            else if (key == (key_t) -1)
                    return(-1);     /* probably an ftok() error by caller */
    again:
            if ( (id = semget(key, 3, 0666 | IPC_CREAT)) < 0)
                    return(-1);     /* permission problem or tables full */
            /*
             * Get a lock on the semaphore by waiting for [2] to equal 0,
             * then increment it.
             */
            if (semop(id, &op_lock[0], 2) < 0) {
                    if (errno == EINVAL)
                            goto again;
                    printf("can't lock");
            }
            /*
             * Get the value of the process counter. If it equals 0,
             * then no one has initialized the semaphore yet.
             */
            if ( (semval = semctl(id, 1, GETVAL, 0)) < 0)
                    printf("can't GETVAL");
            if (semval == 0) {
                    semctl_arg.val = initval;
                    if (semctl(id, 0, SETVAL, semctl_arg) < 0)
                            printf("can't SETVAL[0]");
                    semctl_arg.val = BIGCOUNT;
                    if (semctl(id, 1, SETVAL, semctl_arg) < 0)
                            printf("can't SETVAL[1]");
            }
            /*
             * Decrement the process counter and then release the lock.
             */
            if (semop(id, &op_endcreate[0], 2) < 0)
                    printf("can't end create");
            return(id);
    }
    /****************************************************************************
     * Open a semaphore that must already exist.
     */
    int
    sem_open(key)
    key_t   key;
    {
            register int    id;
            if (key == IPC_PRIVATE)
                    return(-1);     /* not intended for private semaphores */
            else if (key == (key_t) -1)
                    return(-1);     /* probably an ftok() error by caller */
            if ( (id = semget(key, 3, 0)) < 0)
                    return(-1);     /* doesn't exist, or tables full */
            /*
             * Decrement the process counter.  You do not need a lock
             * to do this.
             */
            if (semop(id, &op_open[0], 1) < 0)
                    printf("can't open");
            return(id);
    
    }
    /****************************************************************************
     * Remove a semaphore.
    */
    sem_rm(id)
    int     id;
    {
            if (semctl(id, 0, IPC_RMID, 0) < 0)
                    printf("can't IPC_RMID");
    }
    /****************************************************************************
     * Close a semaphore.
    */
    sem_close(id)
    int     id;
    {
            register int    semval;
            /*
             * The following semop() first gets a lock on the semaphore,
             * then increments [1] - the process counter.
             */
            if (semop(id, &op_close[0], 3) < 0)
                    printf("can't semop");
            /*
             * Now that you have a lock, read the value of the process
             * counter to see if this is the last reference to the
             * semaphore.
             */
            if ( (semval = semctl(id, 1, GETVAL, 0)) < 0)
                    printf("can't GETVAL");
            if (semval > BIGCOUNT)
                    printf("sem[1] > BIGCOUNT");
            else if (semval == BIGCOUNT)
                    sem_rm(id);
            else
                    if (semop(id, &op_unlock[0], 1) < 0)
                            printf("can't unlock"); /* unlock */
    }
    /****************************************************************************
     * Wait until a semaphore's value is greater than 0, then decrement
     * it by 1 and return.
    */
    sem_wait(id)
    int     id;
    {
            sem_op(id, -1);
    }
    /****************************************************************************
     * Increment a semaphore by 1.
    */
    sem_signal(id)
    int     id;
    {
            sem_op(id, 1);
    }
    /****************************************************************************
     * General semaphore operation. Increment or decrement by a user-specified
     * amount (positive or negative; amount cannot be zero).
     */
    sem_op(id, value)
    int     id;
    int     value;
    {
            if ( (op_op[0].sem_op = value) == 0)
                    printf("can't have value == 0");
            if (semop(id, &op_op[0], 1) < 0)
                    printf("sem_op error");
    }
    
  5. On completion of the code and while in the vi editor, press the ESC key, then the : key (the colon key), followed by wq to save the file and exit from the editor application.

  6. Repeat steps 3 through 5 for lockmain.c and for the seqno file.

  7. Compile the subfunctions program.

  8. To compile the subfunctions program, use the following command:

    $gcc -c subfunctions.c
  9. Build the file locking application using semaphores.

    To build the application, use the following commands:

    $gcc -c lockmain.c
    $gcc -o filelock lockmain.o subfunctions.o

    The -c option compiles lockmain.c but does not link it. The listing for lockmain.c follows.

    /* lockmain.c, Locking example using the simpler semaphore operations */
    #include        <stdio.h>
    #include        <string.h>
    #include        <unistd.h>
    #include        <fcntl.h>
    #include        <sys/types.h>
    #define SEQFILE         "seqno"
    #define SEMKEY          ((key_t) 23456L)
    #define MAXBUFF         100
    main()
    {
            int     fd, i, n, pid, seqno, semid;
            char    buff[MAXBUFF];
            pid = getpid();
            if ( (fd = open(SEQFILE, 2)) < 0)
                    printf("can't open %s", SEQFILE);
            if ( (semid = sem_create(SEMKEY, 1)) < 0)
                    printf("can't open semaphore");
            for (i = 0; i < 20; i++) {
                    sem_wait(semid);                /* get the lock */
                    lseek(fd, 0L, 0);               /* rewind before read */
                    if ( (n = read(fd, buff, MAXBUFF - 1)) <= 0)  /* leave room for '\0' */
                            printf("read error");
                    buff[n] = '\0';         /* null terminate for sscanf */
                    if ( (n = sscanf(buff, "%d\n", &seqno)) != 1)
                            printf("sscanf error");
                    printf("pid = %d, seq# = %d\n", pid, seqno);
                    seqno++;
                    sprintf(buff, "%03d\n", seqno);
                    n = strlen(buff);
                    lseek(fd, 0L, 0);               /* rewind before write */
                    if (write(fd, buff, n) != n)
                            printf("write error");
                    sem_signal(semid);              /* release the lock */
            }
            sem_close(semid);
    }
    

    The seqno file contains a single line holding the starting sequence number and nothing else; do not add a comment line. The sample output in step 11 starts at 1, so the file contains:

    1
  10. Now you can execute the application by typing the output file name and its path as shown below:

    $ ./filelock
  11. If the file seqno exists and the file path is correct, the application displays the following output on the console:

          pid = 27688, seq# = 1
           pid = 27688, seq# = 2
           pid = 27688, seq# = 3
           pid = 27688, seq# = 4
           pid = 27688, seq# = 5
           pid = 27688, seq# = 6
           pid = 27688, seq# = 7
           pid = 27688, seq# = 8
           pid = 27688, seq# = 9
           pid = 27688, seq# = 10
           pid = 27688, seq# = 11
           pid = 27688, seq# = 12
           pid = 27688, seq# = 13
           pid = 27688, seq# = 14
           pid = 27688, seq# = 15
           pid = 27688, seq# = 16
           pid = 27688, seq# = 17
           pid = 27688, seq# = 18
           pid = 27688, seq# = 19
           pid = 27688, seq# = 20
    

    After displaying the above information, the program terminates.

    Note: The value of pid will be different in your case.

  12. You can also run two instances of the program at the same time. To do so, type:

    $ ./filelock & ./filelock &
  13. Verify the sequence number and pid in the output.

Shared Memory

  • Shared memory allows two or more processes to share a memory segment.

  • Shared memory is maintained by the kernel

  • Processes that want to use the same shared memory use a unique identifier to identify it

  • There are various limits on the shared memory

    • Maximum size of a shared memory segment

    • Minimum size of the shared memory segment

    • Maximum number of shared memory segments, systemwide

  • Processes using shared memory use semaphores to synchronize access to it

Shared memory lets two or more processes share a memory segment. Once a segment is attached, it is part of the process's address space, so a process can pass an address inside the segment as the buffer to the read() system call and have the data placed directly in shared memory. This avoids making another copy of the same data when it is read, which makes shared memory a faster way of communicating between two processes than message queues. With message queues, all the information goes through the kernel and is copied into a separate buffer supplied by the receiving process. The difference matters most when a large amount of information flows between the processes.

While one process is reading data into the shared memory, the other process must wait for the read to finish before processing the data. Semaphores are used for this synchronization.

Shared Memory

  • The kernel maintains a structure of information for every shared memory segment. For example, the shared memory structure is:

    struct shmid_ds{
         struct ipc_perm shm_perm;
         int       shm_segsz;
         pid_t     shm_lpid;
         pid_t     shm_cpid;
         shmatt_t  shm_nattch;
         time_t    shm_atime;
         time_t    shm_dtime;
         time_t    shm_ctime; };
  • System calls for shared memory

    • shmget - creates a shared memory segment or accesses an existing one

    • shmat - attaches the shared memory segment

    • shmdt - detaches the shared memory segment

    • shmctl - provides various control operations on a shared memory segment

The above slide describes the structure information for the shared memory segment. The ipc_perm structure as described in the Message Queues topic is used for the access permissions of the shared memory segment.

shmget() System Call

A shared memory segment is created, or an existing one is accessed with the shmget() system call. The signature of the call is:

      #include <sys/types.h>
       #include <sys/ipc.h>
       #include <sys/shm.h>
       int shmget(key_t key, int size, int shmflag);

The value returned by the shmget() call is the shared memory identifier, shmid, or -1 if an error occurs. The shmget() call does not by itself give the calling process access to the segment; the process must first attach it with shmat().

shmat() System Call

A shared memory segment is attached using the shmat() system call. The signature of the call is:

      #include <sys/types.h>
       #include <sys/ipc.h>
       #include <sys/shm.h>
       char* shmat(int shmid, char *shmaddr, int shmflag);

The value returned by shmat() is the starting address of the shared memory segment. For all practical purposes, a call to shmat() with a shmaddr of zero lets the kernel select the starting address for the shared memory segment. By default, the shared memory is attached for both reading and writing by the calling process.

shmdt() System Call

When a process is finished with a shared memory segment, it detaches the segment using the shmdt() system call. The signature of the call is:

      #include <sys/types.h>
       #include <sys/ipc.h>
       #include <sys/shm.h>
       int shmdt(char *shmaddr);

The call to shmdt does not delete the shared memory segment.

shmctl() System Call

To remove a shared memory segment, the shmctl() system call is used. The signature of the call is:

      #include <sys/types.h>
       #include <sys/ipc.h>
       #include <sys/shm.h>
       int shmctl(int shmid, int cmd, struct shmid_ds *buff);

The variable cmd with a value IPC_RMID removes a shared memory segment from the system.
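The four calls described above can be tied together in one short routine. The following is a minimal sketch, not part of the lab code: the helper name shm_roundtrip, the SEG_SIZE value, and the 0600 permissions are assumptions chosen for illustration, and IPC_PRIVATE is used so the sketch does not collide with the keys used in the labs.

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SEG_SIZE 4096   /* assumed segment size for this sketch */

/*
 * Copy text into a new shared memory segment, read it back into out,
 * then detach and remove the segment.
 * Returns 0 on success, -1 if any step fails.
 * shm_roundtrip is a hypothetical helper name.
 */
int shm_roundtrip(const char *text, char *out, size_t outsize)
{
    int   shmid, status = 0;
    char *addr;

    if (outsize == 0)
        return -1;

    /* Create a new segment; 0600 gives the owner read and write access. */
    shmid = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    /* A shmaddr of zero lets the kernel select the attach address. */
    addr = (char *) shmat(shmid, NULL, 0);
    if (addr == (char *) -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    /* The attached segment is ordinary memory: write, then read back. */
    strncpy(addr, text, SEG_SIZE - 1);
    addr[SEG_SIZE - 1] = '\0';
    strncpy(out, addr, outsize - 1);
    out[outsize - 1] = '\0';

    if (shmdt(addr) < 0)                    /* detach; segment still exists */
        status = -1;
    if (shmctl(shmid, IPC_RMID, NULL) < 0)  /* remove it from the system */
        status = -1;
    return status;
}
```

A caller would pass a destination buffer and its size, for example shm_roundtrip("hello", buf, sizeof buf). In a real client-server pair, both processes would call shmget() with the same key rather than IPC_PRIVATE, as LAB 8 does.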

LAB 8: Shared Memory

  • A simple client-server application demonstrates the shared memory interface in System V IPC

    • The application consists of a client program and a server program that use a shared memory segment to transfer the contents of a file.

    • The client reads the filename from the standard input and writes it to the shared memory

    • The server gets the filename from the shared memory.

      To access the shared memory segment it locks it using a semaphore.

    • The server reads the content of the file and writes it on to the shared memory segment.

    • Meanwhile, the client waits on the semaphore till it is released by the server.

    • The client then reads the data from the shared memory segment and writes it to the standard output.

Building the client-server application using shared memory

There are three source files: shmsvr.c, shmcli.c, and subfunctions.c.

  • subfunctions.c implements the common routines for creating semaphores, performing operations on semaphores, removing semaphores, and closing semaphores.

  • shmsvr.c is the server application.

  • shmcli.c is the client application.

Build the programs using the following steps:

  1. Run the INTERIX Korn shell.

  2. Browse to the samples directory.

           $cd samples
  3. At the $ prompt, type the following command:

           $vi shmsvr.c
  4. Type the code as given below for shmsvr.c file. You can also copy and paste the code, if it is already coded and available in a Windows text file.

    /* shmsvr.c */
    #include        <stdio.h>
    #include        <stdlib.h>
    #include        <string.h>
    #include        <unistd.h>
    #include        <fcntl.h>
    #include        <sys/types.h>
    #include        <sys/ipc.h>
    #include        <sys/shm.h>
    /*
     * Definition of "our" message.
     */
    #define MAXMESGDATA     (4096-16)
    #define MESGHDRSIZE     (sizeof(Mesg) - MAXMESGDATA)
    typedef struct {
      int   mesg_len;       /* #bytes in mesg_data, can be 0 or > 0 */
      long  mesg_type;      /* message type, must be > 0 */
      char  mesg_data[MAXMESGDATA];
    } Mesg;
    #define NBUFF   4       /* number of buffers in shared memory */
                            /* (for multiple buffer version) */
    #define SHMKEY  ((key_t) 7890) /* base value for shmem key */
    #define SEMKEY1 ((key_t) 7891) /* client semaphore key */
    #define SEMKEY2 ((key_t) 7892) /* server semaphore key */
    #define PERMS   0666
    int     shmid, clisem, servsem; /* shared memory and semaphore IDs */
    Mesg    *mesgptr;               /* ptr to message structure, which is
                                       in the shared memory segment */
    main()
    {
            /*
             * Create the shared memory segment, if required,
             * then attach it.
             */
            if ( (shmid = shmget(SHMKEY, sizeof(Mesg), PERMS | IPC_CREAT)) < 0)
                    printf("server: can't get shared memory");
            if ( (mesgptr = (Mesg *) shmat(shmid, (char *) 0, 0)) == (Mesg *) -1)
                    printf("server: can't attach shared memory");
            /*
             * Create two semaphores.  The client semaphore starts out at 1
             * since the client process starts things going.
             */
            if ( (clisem = sem_create(SEMKEY1, 1)) < 0)
                    printf("server: can't create client semaphore");
            if ( (servsem = sem_create(SEMKEY2, 0)) < 0)
                    printf("server: can't create server semaphore");
            server();
            /*
             * Detach the shared memory segment and close the semaphores.
             */
            if (shmdt(mesgptr) < 0)
                    printf("server: can't detach shared memory");
            sem_close(clisem);
            sem_close(servsem);
            exit(0);
    }
    server()
    {
            int     n, filefd;
            char    errmesg[256];
            /*
             * Wait for the client to write the filename into shared memory.
             */
            sem_wait(servsem);      /* we'll wait here for client to start things */
            mesgptr->mesg_data[mesgptr->mesg_len] = '\0';
                                            /* null terminate filename */
            if ( (filefd = open(mesgptr->mesg_data, 0)) < 0) {
                    /*
                     * Error.  Format an error message and send it back
                     * to the client.
                     */
                    strcpy(errmesg, " can't open file");
                    strcat(mesgptr->mesg_data, errmesg);
                    mesgptr->mesg_len = strlen(mesgptr->mesg_data);
                    sem_signal(clisem);             /* send to client */
                    sem_wait(servsem);              /* wait for client to process */
            } else {
                    /*
                     * Read the data from the file right into shared memory.
                     */
                    while ( (n = read(filefd, mesgptr->mesg_data,
                                                            MAXMESGDATA-1)) > 0) {
                            mesgptr->mesg_len = n;
                            sem_signal(clisem);     /* send to client */
                            sem_wait(servsem);      /* wait for client to process */
                    }
                    close(filefd);
                    if (n < 0)
                            printf("server: read error");
            }
            /*
             * Send a message with a length of 0 to signify the end.
             */
            mesgptr->mesg_len = 0;
            sem_signal(clisem);
    }
    
  5. On completion of the code and while in the vi editor, press the ESC key, then the colon key (:), followed by wq to save the file and exit from the editor.

  6. Repeat steps 3 through 5 for the client code, shmcli.c, and for subfunctions.c.

  7. Compile the subfunctions program.

    To compile the subfunctions program, use the following command:

         $gcc -c subfunctions.c

    The subfunctions.c file is listed under "Semaphores" in the System V IPC section. If you already compiled it in LAB 7, you can reuse subfunctions.o.

  8. Build the server program.

    To build the server program, use the following commands:

         $gcc -c shmsvr.c
         $gcc -o shmserver shmsvr.o subfunctions.o

    The -c option compiles shmsvr.c but does not link it.

  9. Build the client program.

    To build the client program, use the following commands:

         $gcc -c shmcli.c
         $gcc -o shmclient shmcli.o subfunctions.o

    The -c option compiles shmcli.c but does not link it. The listing for shmcli.c follows.

    /* shmcli.c*/
    #include        <stdio.h>
    #include        <stdlib.h>
    #include        <string.h>
    #include        <unistd.h>
    #include        <sys/types.h>
    #include        <sys/ipc.h>
    #include        <sys/shm.h>
    /*
     * Definition of "our" message.
    */
    #define MAXMESGDATA     (4096-16)
    #define MESGHDRSIZE     (sizeof(Mesg) - MAXMESGDATA)
    typedef struct {
      int   mesg_len;       /* #bytes in mesg_data, can be 0 or > 0 */
      long  mesg_type;      /* message type, must be > 0 */
      char  mesg_data[MAXMESGDATA];
    } Mesg;
    #define NBUFF   4       /* number of buffers in shared memory */
    #define SHMKEY  ((key_t) 7890) /* base value for shmem key */
    #define SEMKEY1 ((key_t) 7891) /* client semaphore key */
    #define SEMKEY2 ((key_t) 7892) /* server semaphore key */
    #define PERMS   0666
    int     shmid, clisem, servsem; /* shared memory and semaphore IDs */
    Mesg    *mesgptr;               /* ptr to message structure, which is
                                       in the shared memory segment */
    main()
    {
            /*
             * Get the shared memory segment and attach it.
             */
            if ( (shmid = shmget(SHMKEY, sizeof(Mesg), 0)) < 0)
                    printf("client: can't get shared memory segment");
            if ( (mesgptr = (Mesg *) shmat(shmid, (char *) 0, 0)) == (Mesg *) -1)
                    printf("client: can't attach shared memory segment");
            /*
             * Open the two semaphores.  The server must have
             * created them already.
             */
            if ( (clisem = sem_open(SEMKEY1)) < 0)
                    printf("client: can't open client semaphore");
            if ( (servsem = sem_open(SEMKEY2)) < 0)
                    printf("client: can't open server semaphore");
            client();
            /*
             * Detach and remove the shared memory segment and
             * close the semaphores.
             */
            if (shmdt(mesgptr) < 0)
                    printf("client: can't detach shared memory");
            if (shmctl(shmid, IPC_RMID, (struct shmid_ds *) 0) < 0)
                    printf("client: can't remove shared memory");
            sem_close(clisem);      /* will remove the semaphore */
            sem_close(servsem);     /* will remove the semaphore */
            exit(0);
    }
    client()
    {
            int     n;
            /*
             * Read the filename from standard input, write it to shared memory.
             */
            sem_wait(clisem);               /* get control of shared memory */
            if (fgets(mesgptr->mesg_data, MAXMESGDATA, stdin) == NULL)
                    printf("filename read error");
            n = strlen(mesgptr->mesg_data);
            if (mesgptr->mesg_data[n-1] == '\n')
                    n--;                    /* ignore newline from fgets() */
            mesgptr->mesg_len = n;
            sem_signal(servsem);            /* wake up server */
            /*
             * Wait for the server to place something in shared memory.
             */
            sem_wait(clisem);               /* wait for server to process */
            while( (n = mesgptr->mesg_len) > 0) {
                    if (write(1, mesgptr->mesg_data, n) != n)
                            printf("data write error");
                    sem_signal(servsem);    /* wake up server */
                    sem_wait(clisem);       /* wait for server to process */
            }
            if (n < 0)
                    printf("data read error");
    }
  10. Now you can execute the server application by typing the output file name and its path. Remember to execute the server program first.

         $ ./shmserver &

    The trailing & runs the server application in the background.
  11. Create a file named test.txt under the current directory. The test file should contain the following two lines:

        This is a test file.
         Testing shared memory in System V IPC
  12. Now execute the client application with the following command:

         $ ./shmclient <ENTER>

    The client application now waits for the user to enter a file name. Specify the path of the file:

         test.txt

    Because the file exists and the file path is correct, the client process displays the following message on the console:

        This is a test file.
         Testing shared memory in System V IPC

    After displaying the above message, the client and the server programs terminate.

BSD Strings and Memory Functions

  • Strings and Memory functions

    • INTERIX supports an extensive set of string and memory functions. These functions are declared in the <string.h> header file.

    • Strings and memory functions can be broadly classified under the following categories

      • Copying and concatenation

      • String Length

      • String/Array Comparison

      • Collation Functions

      • Search Functions

      • Finding tokens in a string

Operations on strings (or arrays of characters) are an important part of many programs. The INTERIX library provides an extensive set of string utility functions, including functions for copying, concatenating, comparing, and searching strings. The memory functions operate on arbitrary regions of storage; for example, the memcpy function can be used to copy the contents of any kind of array.

Copying and Concatenation Functions

  • The following are the functions provided by INTERIX for the purpose of copying and concatenation

    • strcpy (char *to, const char *from) - This function copies characters from the string from into the string to

    • strncpy (char *to, const char *from, size_t size) - This function is similar to strcpy but always writes exactly size characters into to

    • memcpy (void *to, const void *from, size_t size) - The memcpy function copies size bytes from the object beginning at from into the object beginning at to

    • memmove (void *to, const void *from, size_t size) - memmove copies the size bytes at from into the size bytes at to, even if those two blocks of space overlap

    • memccpy (void *to, const void *from, int c, size_t size) - This function copies no more than size bytes from from to to, stopping if a byte matching c is found

You can use the functions described in this section to copy the contents of strings and arrays, or to append the contents of one string to another. The str and mem functions are declared in the header file string.h.

A helpful way to remember the ordering of the arguments to the functions in this section is that it corresponds to an assignment expression, with the destination array specified to the left of the source array. Most of these functions return the address of the destination array; the exceptions, such as memccpy, strdup, and the BSD functions bcopy and bzero, are noted in their descriptions.

strcpy Function

          char * strcpy (char *to, const char *from)

This function copies characters from the string from (up to and including the terminating null character) into the string to. Like memcpy, this function has undefined results if the strings overlap. The return value is the value of to.

For example,

          strcpy (to, "hello");

After string copy using strcpy, to contains: [hello\0]

strncpy Function

          char * strncpy (char *to, const char *from, size_t size)

This function is similar to strcpy but always writes exactly size characters into to. If the length of from is less than size, then strncpy copies all of from, followed by enough null characters to add up to size characters in all. This behavior is rarely useful, but it is specified by the ISO C standard. The behavior of strncpy is undefined if the strings overlap.

Using strncpy as opposed to strcpy is a way to avoid bugs relating to writing past the end of the allocated space for to. However, it can also make your program much slower in one common case: copying a string that is probably small into a potentially large buffer. In this case, size may be large, and when it is, strncpy will waste a considerable amount of time copying null characters. For example,

          strncpy (to, "hello", SIZE);

With SIZE defined as 10, to contains [hello\0\0\0\0\0] after this call.

The padding appears because the length of the from string (5) is smaller than the size specified for copying (10). If the length of from is size or more, strncpy copies just the first size characters. Note that in this case no null terminator is written into to.

For example,

          strncpy (to, "hello world", SIZE);

After this call, to contains [hello worl], with no terminating null character.
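Because strncpy may leave the destination without a terminator, a common precaution is to reserve one extra byte and write the null character explicitly. The following is a minimal sketch of that idea; the helper name copy_bounded is hypothetical, and SIZE is assumed to be 10 as in the examples above.

```c
#include <string.h>

#define SIZE 10   /* assumed copy limit, matching the examples above */

/*
 * Copy src into dst, which must hold SIZE + 1 bytes, and always
 * terminate the result. copy_bounded is a hypothetical helper name.
 */
char *copy_bounded(char dst[SIZE + 1], const char *src)
{
    strncpy(dst, src, SIZE);   /* pads with nulls, or truncates at SIZE */
    dst[SIZE] = '\0';          /* strncpy alone may not terminate dst */
    return dst;
}
```

With this helper, copying "hello" leaves dst holding "hello" followed by null padding, and copying "hello world" leaves the terminated string "hello worl".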

memcpy Function

     void * memcpy (void *to, const void *from, size_t size)

The memcpy function copies size bytes from the object beginning at from into the object beginning at to. The behavior of this function is undefined if the two arrays, to and from, overlap. Use memmove instead of memcpy, if overlapping is possible. The value returned by memcpy is the value of to. Here is an example of how you might use memcpy to copy the contents of an array:

          pTo = memcpy (to, "hello" , 6 * sizeof (char));

After this call, the first six characters of to contain [hello\0], and pTo points to to.

memmove Function

     void * memmove (void *to, const void *from, size_t size)

memmove copies the size bytes at from into the size bytes at to, even if those two blocks of space overlap. In the case of overlap, memmove copies the original values of the bytes in the block at from, including those bytes which also belong to the block at to. For example,

     pTo = memmove (to, to , 6 * sizeof (char));

After this call, the first six characters of to still contain [hello\0], because the source and the destination are the same block.
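A case where the overlap actually matters is shifting a string within its own buffer. The sketch below is illustrative only; prepend_char is a hypothetical helper name, and the caller is assumed to supply a buffer with at least one spare byte.

```c
#include <string.h>

/*
 * Insert c in front of the string held in buf by shifting the string
 * one byte to the right. Source and destination overlap, so memmove
 * is used instead of memcpy. buf must have room for strlen(buf) + 2
 * bytes. prepend_char is a hypothetical helper name.
 */
char *prepend_char(char *buf, char c)
{
    size_t len = strlen(buf) + 1;   /* include the terminating null */

    memmove(buf + 1, buf, len);     /* overlapping blocks are safe here */
    buf[0] = c;
    return buf;
}
```

For example, with char buf[16] = "ello", calling prepend_char(buf, 'h') leaves "hello" in buf.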

memccpy Function

     void * memccpy (void *to, const void *from, int c, size_t size)

This function copies no more than size bytes from from to to, stopping if a byte matching c is found. The return value is a pointer into to one byte past where c was copied, or a null pointer if no byte matching c appeared in the first size bytes of from.

For example,

          pTo = memccpy (to, "hello" , 'l', 6 * sizeof (char));

After this call, the first three characters of to contain [hel], because copying stops after the first 'l'. Also, pTo is equal to to + 3.
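The return value of memccpy can be turned into a byte count, which makes the "found" and "not found" cases easy to tell apart. This is a minimal sketch under that assumption; copy_through is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 700
#include <string.h>

/*
 * Copy bytes from src to dst up to and including the first byte equal
 * to stop, examining no more than n bytes. Returns the number of bytes
 * copied when stop was found, or -1 when it was not found (in which
 * case n bytes were copied). copy_through is a hypothetical helper name.
 */
long copy_through(char *dst, const char *src, int stop, size_t n)
{
    char *end = memccpy(dst, src, stop, n);

    if (end == NULL)
        return -1;
    return (long) (end - dst);
}
```

Calling copy_through(to, "hello", 'l', 6) returns 3, which matches the pTo = to + 3 result described above.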

Copying and Concatenation Functions

  • memset (void *block, int c, size_t size) - This function copies the value of c (converted to an unsigned char) into each of the first size bytes of the object beginning at block

  • strdup (const char *s) - This function copies the null-terminated string s into a newly allocated string and returns a pointer to the new string. The string is allocated using malloc

  • strcat (char *to, const char *from) - This function is similar to strcpy, except that the characters from from are concatenated or appended to the end of to, instead of overwriting it.

  • strncat (char *to, const char *from, size_t size) - This function is like strcat except that not more than size characters from from are appended to the end of to

  • bcopy (const void *from, void *to, size_t size) - This is a partially obsolete alternative for memmove, derived from BSD.

  • bzero (void *block, size_t size) - This is a partially obsolete alternative for memset

memset Function

     void * memset (void *block, int c, size_t size)

This function copies the value of c (converted to an unsigned char) into each of the first size bytes of the object beginning at block. It returns the value of block.

For example,

          pTo = memset (to, '\0', 6 * sizeof (char));

After this call, the first six characters of to contain [\0\0\0\0\0\0], and pTo points to to.

strdup Function

     char * strdup (const char *s)

This function copies the null-terminated string s into a newly allocated string. The string is allocated using malloc, so the caller should release it with free when it is no longer needed. If malloc cannot allocate space for the new string, strdup returns a null pointer. Otherwise, it returns a pointer to the new string.

For example,

          pTo = strdup ("hello");
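Because strdup returns storage obtained from malloc, it is a convenient way to get a private, modifiable copy of a string. The following is a minimal sketch; upper_copy is a hypothetical helper name, and the conversion to uppercase is only there to show that the copy can be changed without touching the original.

```c
#define _XOPEN_SOURCE 700
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return an uppercase copy of s in newly allocated storage, or a null
 * pointer if the allocation fails. The caller owns the result and must
 * release it with free(). upper_copy is a hypothetical helper name.
 */
char *upper_copy(const char *s)
{
    char  *copy = strdup(s);
    size_t i;

    if (copy == NULL)
        return NULL;
    for (i = 0; copy[i] != '\0'; i++)
        copy[i] = (char) toupper((unsigned char) copy[i]);
    return copy;
}
```

A caller would typically check the result for a null pointer, use it, and then pass it to free.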

strcat Function

     char * strcat (char *to, const char *from)

The strcat function is similar to strcpy, except that the characters from from are concatenated or appended to the end of to, instead of overwriting it. That is, the first character from from overwrites the null character marking the end of to.

For example,

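     char to[16] = "hello";
     pTo = strcat (to, "world");

After this call, to contains [helloworld\0], and pTo points to to. The array to must be large enough to hold the combined string and its terminating null character; here it is declared with 16 bytes.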

strncat Function

     char * strncat (char *to, const char *from, size_t size)

This function is like strcat except that not more than size characters from from are appended to the end of to. A single null character is also always appended to to, so the total allocated size of to must be at least size + 1 bytes longer than its initial length. For example,

     char to[16] = "hello";
     pTo = strncat (to, ", world", 4);

After this call, to contains [hello, wo\0], and pTo points to to.
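The size argument of strncat limits how many characters are appended, not the total size of the destination, so the caller has to work out how much room is left. The sketch below shows one way to do that; append_bounded is a hypothetical helper name, and dst is assumed to already hold a null-terminated string shorter than dstsize.

```c
#include <string.h>

/*
 * Append src to the string in dst without writing past a buffer of
 * dstsize bytes. Returns 1 if all of src fit, 0 if it was truncated.
 * append_bounded is a hypothetical helper name.
 */
int append_bounded(char *dst, size_t dstsize, const char *src)
{
    size_t used = strlen(dst);
    size_t room = dstsize - used - 1;   /* reserve a byte for the null */

    strncat(dst, src, room);            /* at most room chars, plus null */
    return strlen(src) <= room;
}
```

With a 10-byte buffer holding "hello", appending ", world" leaves "hello, wo" and reports truncation, the same result as the strncat example above.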

bcopy Function

     void bcopy (const void *from, void *to, size_t size)

This is a partially obsolete alternative for memmove, derived from BSD. Note that it is not quite equivalent to memmove, because the arguments are not in the same order.

bzero Function

     void bzero (void *block, size_t size)

This is a partially obsolete alternative for memset, derived from BSD. Note that it is not as general as memset, because the only value it can store is zero.

String Length functions

  • The following are the functions provided by INTERIX for the purpose of determining the length of strings

    • strlen (const char *s) - The strlen function returns the length of the null-terminated string s in bytes. (In other words, it returns the offset of the terminating null character within the array.)

You can get the length of a string using the strlen function. This function is declared in the header file string.h.

     size_t strlen (const char *s)

The strlen function returns the length of the null-terminated string s in bytes. (In other words, it returns the offset of the terminating null character within the array.) For example,

     char to[] = "helloworld";
     strlen (to);

The call returns 10.

String/Array Comparison Functions

  • String comparison functions

    • The following are the functions provided by INTERIX for comparing strings

      • memcmp (const void *a1, const void *a2, size_t size) - The function memcmp compares the size bytes of memory beginning at a1 against the size bytes of memory beginning at a2.

      • strcmp (const char *s1, const char *s2) - The strcmp function compares the string s1 against s2, returning a value that has the same sign as the difference between the first differing pair of characters

      • strncmp (const char *s1, const char *s2, size_t size) - This function is similar to strcmp, except that no more than size characters are compared.

      • strcasecmp (const char *s1, const char *s2) - This function is like strcmp, except that differences in case are ignored

      • strncasecmp (const char *s1, const char *s2, size_t n) - This function is like strncmp, except that differences in case are ignored

You can use the functions in this topic to perform comparisons on the contents of strings and arrays.

Unlike most comparison operations in C, the string comparison functions return a nonzero value if the strings are not equivalent rather than if they are. The sign of the value indicates the relative ordering of the first characters in the strings that are not equivalent: a negative value indicates that the first string is "less" than the second, while a positive value indicates that the first string is "greater".

The most common use of these functions is to check only for equality. All of these functions are declared in the header file string.h.

memcmp Function

     int memcmp (const void *a1, const void *a2, size_t size)

The function memcmp compares the size bytes of memory beginning at a1 against the size bytes of memory beginning at a2. The value returned has the same sign as the difference between the first differing pair of bytes (interpreted as unsigned char objects, then promoted to int). If the contents of the two blocks are equal, memcmp returns 0.

On arbitrary arrays, the memcmp function is mostly useful for testing equality. It usually is not meaningful to do byte-wise ordering comparisons on arrays of things other than bytes. For example, a byte-wise comparison on the bytes that make up floating-point numbers is not likely to tell you anything about the relationship between the values of the floating-point numbers.

You should also be careful about using memcmp to compare objects that can contain "holes", such as the padding inserted into structure objects to enforce alignment requirements, extra space at the end of unions, and extra characters at the ends of strings whose length is less than their allocated size. The contents of these "holes" are indeterminate and may cause strange behavior when performing byte-wise comparisons. For more predictable results, perform an explicit component-wise comparison.
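The difference between a byte-wise and a component-wise comparison can be sketched as follows. The struct record type and both helper names are hypothetical; whether padding is actually present after the tag member depends on the compiler and platform.

```c
#include <string.h>

struct record {
    char tag;     /* padding bytes typically follow this member */
    int  value;
};

/*
 * Byte-wise comparison: any padding takes part, so two records whose
 * members are all equal can still compare as different.
 */
int record_bytes_equal(const struct record *a, const struct record *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}

/* Component-wise comparison: only the named members are compared. */
int record_equal(const struct record *a, const struct record *b)
{
    return a->tag == b->tag && a->value == b->value;
}
```

When two records are built independently, record_equal gives the predictable answer, while record_bytes_equal depends on whatever happens to be in the padding.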

strcmp Function

     int strcmp (const char *s1, const char *s2)

The strcmp function compares the string s1 against s2, returning a value that has the same sign as the difference between the first differing pair of characters interpreted as unsigned char objects, then promoted to int. If the two strings are equal, strcmp returns 0. A consequence of the ordering used by strcmp is that if s1 is an initial substring of s2, then s1 is considered to be "less than" s2.

strcmp does not take sorting conventions of the language the strings are written in into account. To do that you have to use strcoll.

For example,

     strcmp ("hello", "hello")

The two strings are the same and the output of the above statement will be 0.

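     strcmp ("hello", "Hello")

The comparison is case-sensitive, so the call returns a positive value, because 'h' sorts after 'H'. An implementation that returns the difference between the character codes gives 32 in ASCII; portable code should rely only on the sign.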

strcasecmp Function

     int strcasecmp (const char *s1, const char *s2)

This function is like strcmp, except that differences in case are ignored. How uppercase and lowercase characters are related is determined by the currently selected locale. In the standard "C" locale the characters Ä and ä do not match, but in a locale that regards these characters as part of the alphabet they do match.

strncmp Function

     int strncmp (const char *s1, const char *s2, size_t size)

This function is similar to strcmp, except that no more than size characters are compared. In other words, if the two strings are the same in their first size characters, the return value is zero. For example,

     strncmp ("hello", "hello, world", 5)

The initial 5 characters are the same, so the above statement returns 0.
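A common use of strncmp is a prefix test. The following is a small sketch; has_prefix is a hypothetical helper name, not part of any library.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if 'string' begins with 'prefix', 0 otherwise.
   strncmp stops at the first null character, so a 'string' that is
   shorter than 'prefix' is handled safely. */
int has_prefix(const char *string, const char *prefix)
{
    assert(string != NULL && prefix != NULL);
    return strncmp(string, prefix, strlen(prefix)) == 0;
}
```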

strncasecmp Function

     int strncasecmp (const char *s1, const char *s2, size_t n)

This function is like strncmp, except that differences in case are ignored. As with strcasecmp, how uppercase and lowercase characters are related depends on the current locale.
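The case-insensitive variants are typically used to match keywords or header names. The sketch below assumes a POSIX system where strcasecmp and strncasecmp are declared in strings.h; the two helper names are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>   /* POSIX declares strcasecmp and strncasecmp here */

/* Return 1 if the two strings are equal when case is ignored. */
int same_ignoring_case(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    return strcasecmp(s1, s2) == 0;
}

/* Return 1 if 'string' begins with 'prefix' when case is ignored. */
int has_prefix_ignoring_case(const char *string, const char *prefix)
{
    assert(string != NULL && prefix != NULL);
    return strncasecmp(string, prefix, strlen(prefix)) == 0;
}
```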

Collation Functions

  • The following are the collation functions for strings provided by INTERIX

    • strcoll (const char *s1, const char *s2) - The strcoll function is similar to strcmp but uses the collating sequence of the current locale for collation (the LC_COLLATE locale).

    • strxfrm (char *restrict to, const char *restrict from, size_t size) - The function strxfrm transforms the string from using the collation transformation determined by the locale currently selected for collation, and stores the transformed string in the array to. Up to size characters (including a terminating null character) are stored.

In some locales, the conventions for lexicographic ordering differ from the strict numeric ordering of character codes. For example, in Spanish most glyphs with diacritical marks, such as accents, are not considered distinct letters for the purposes of collation. On the other hand, the two-character sequence 'll' is treated as a single letter that is collated immediately after 'l'.

Effectively, the way these functions work is by applying a mapping to transform the characters in a string to a byte sequence that represents the string's position in the collating sequence of the current locale. Comparing two such byte sequences in a simple fashion is equivalent to comparing the strings with the locale's collating sequence.

You can use the functions strcoll and strxfrm (declared in the header file string.h) to compare strings using the collation ordering of the current locale.

strcoll Function

     int strcoll (const char *s1, const char *s2)

The strcoll function is similar to strcmp but uses the collating sequence of the current locale for collation (the LC_COLLATE locale).
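One way to apply strcoll is as the comparison step of qsort. This is a sketch only: sort_strings and compare_collated are hypothetical names, and the program is assumed to have selected a locale beforehand (for example with setlocale (LC_COLLATE, "")); without that call the "C" locale is in effect and the order matches strcmp.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: each element is a 'const char *' pointer. */
static int compare_collated(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/* Sort an array of string pointers using the current LC_COLLATE order. */
void sort_strings(const char **strings, size_t count)
{
    assert(strings != NULL || count == 0);
    qsort(strings, count, sizeof *strings, compare_collated);
}
```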

strxfrm Function

     size_t strxfrm (char *restrict to, const char *restrict from, size_t size)

The function strxfrm transforms the string from using the collation transformation determined by the locale currently selected for collation, and stores the transformed string in the array to. Up to size characters (including a terminating null character) are stored.

The return value is the length of the entire transformed string. This value is not affected by the value of size, but if it is greater than or equal to size, the transformed string did not entirely fit in the array to. In this case, only as much of the string as actually fits was stored. To get the whole transformed string, call strxfrm again with a bigger output array.

The transformed string may be longer than the original string, and it may also be shorter. If size is zero, no characters are stored in to. In this case, strxfrm simply returns the number of characters that would be the length of the transformed string. This is useful for determining what size the allocated array should be.
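The size-zero call described above leads to a two-step pattern: ask for the length, allocate, then transform. The following is a minimal sketch; make_sort_key is a hypothetical helper, and the caller is assumed to free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated collation key for 'from', or NULL if the
   allocation fails. Keys built in the same locale can be compared
   with strcmp instead of calling strcoll repeatedly. */
char *make_sort_key(const char *from)
{
    assert(from != NULL);
    /* With size 0 nothing is stored; the destination may be NULL. */
    size_t needed = strxfrm(NULL, from, 0) + 1;   /* +1 for the null */
    char *key = malloc(needed);
    if (key != NULL)
        strxfrm(key, from, needed);
    return key;
}
```

Precomputing keys this way is mainly worthwhile when the same strings are compared many times, such as during a sort.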

Search Functions

  • The following are the search functions for strings provided by INTERIX

    • memchr (const void *block, int c, size_t size) - This function finds the first occurrence of the byte c (converted to an unsigned char) in the initial size bytes of the object beginning at block

    • strchr (const char *string, int c) - The strchr function finds the first occurrence of the character c (converted to a char) in the null-terminated string beginning at string

    • strrchr (const char *string, int c) - The function strrchr is like strchr, except that it searches backwards from the end of the string string (instead of forwards from the front)

    • strstr (const char *haystack, const char *needle) - This is like strchr, except that it searches haystack for a substring needle rather than just a single character

    • strcasestr (const char *haystack, const char *needle) - This is like strstr, except that it ignores case in searching for the full substring. As with strcasecmp, how uppercase and lowercase characters are related depends on the current locale

This topic describes library functions that perform various kinds of searching operations on strings and arrays. These functions are declared in the header file string.h.

memchr Function

     void * memchr (const void *block, int c, size_t size)

This function finds the first occurrence of the byte c (converted to an unsigned char) in the initial size bytes of the object beginning at block. The return value is a pointer to the located byte, or a null pointer if no match was found.
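Unlike the string functions, memchr does not stop at a null byte, so it suits binary buffers. The sketch below converts the returned pointer to an offset; find_byte is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first byte equal to 'c' within the first
   'size' bytes of 'block', or -1 if no such byte exists. */
ptrdiff_t find_byte(const void *block, size_t size, int c)
{
    assert(block != NULL || size == 0);
    const unsigned char *start = block;
    const unsigned char *hit = memchr(block, c, size);
    return hit != NULL ? hit - start : (ptrdiff_t)-1;
}
```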

strchr Function

     char * strchr (const char *string, int c)

The strchr function finds the first occurrence of the character c (converted to a char) in the null-terminated string beginning at string. The return value is a pointer to the located character, or a null pointer, if no match was found.

For example,

     strchr ("hello, world", 'l')

The output of the above statement will be "llo, world".

The terminating null character is considered to be part of the string, so you can use this function to get a pointer to the end of a string by specifying a null character as the value of the c argument.
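Repeated calls to strchr, each starting one character past the previous match, visit every occurrence. The sketch below counts them; count_char is a hypothetical helper, and it assumes c is not the null character, since stepping past the terminator would be invalid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count how many times the character 'c' occurs in 'string'.
   'c' must not be '\0': the loop resumes one past each match. */
size_t count_char(const char *string, int c)
{
    assert(string != NULL && c != '\0');
    size_t n = 0;
    for (const char *p = strchr(string, c); p != NULL; p = strchr(p + 1, c))
        n++;
    return n;
}
```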

strrchr Function

     char * strrchr (const char *string, int c)

The function strrchr is like strchr, except that it searches backwards from the end of the string string (instead of forwards from the front). For example,

     strrchr ("hello, world", 'l')
    

The output of the above statement will be "ld".
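A typical use of strrchr is to isolate the last component of a path. This is a sketch under the assumption that '/' is the separator; file_name_part is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

/* Return a pointer to the part of 'path' after the last '/',
   or 'path' itself if it contains no '/'. */
const char *file_name_part(const char *path)
{
    assert(path != NULL);
    const char *slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : path;
}
```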

strstr Function

     char * strstr (const char *haystack, const char *needle)

This function is like strchr, except that it searches haystack for a substring needle rather than just a single character. It returns a pointer into the string haystack that is the first character of the substring, or a null pointer if no match was found. If needle is an empty string, the function returns haystack. For example,

     strstr ("hello, world", "l")
    

The output of the above statement will be "llo, world".
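Because strstr returns a pointer into haystack, the search can be resumed just past each match. The sketch below counts non-overlapping occurrences; count_substring is a hypothetical helper, and treating an empty needle as zero matches is a choice made here, not something the library dictates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of 'needle' in 'haystack'. */
size_t count_substring(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    size_t len = strlen(needle);
    size_t n = 0;
    if (len == 0)
        return 0;   /* strstr matches an empty needle at every position */
    for (const char *p = strstr(haystack, needle); p != NULL;
         p = strstr(p + len, needle))
        n++;
    return n;
}
```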

strcasestr Function

     char * strcasestr (const char *haystack, const char *needle)

This function is like strstr, except that it ignores case in searching for the substring. As with strcasecmp, how uppercase and lowercase characters are related depends on the current locale. For example,

     strcasestr ("hello, world", "L")

The output of the above statement will be "llo, world".
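strcasestr is an extension rather than part of POSIX, so it may be missing on some systems. The following sketch shows one way to get similar behavior from strncasecmp; find_ignoring_case is a hypothetical name, and the code is not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp */

/* Return a pointer to the first case-insensitive occurrence of 'needle'
   in 'haystack', or NULL if there is none. An empty needle matches at
   the start, as with strstr. */
const char *find_ignoring_case(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    size_t len = strlen(needle);
    for (const char *p = haystack; ; p++) {
        if (strncasecmp(p, needle, len) == 0)
            return p;
        if (*p == '\0')
            return NULL;   /* reached the end without a match */
    }
}
```

This simple scan compares at every position and is therefore slower than a tuned library routine on long inputs.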

Search Functions

  • strspn (const char *string, const char *skipset) - The strspn ("string span") function returns the length of the initial substring of string that consists entirely of characters that are members of the set specified by the string skipset

  • strcspn (const char *string, const char *stopset) - The strcspn ("string complement span") function returns the length of the initial substring of string that consists entirely of characters that are not members of the set specified by the string stopset.

  • strpbrk (const char *string, const char *stopset) - The strpbrk ("string pointer break") function is related to strcspn, except that it returns a pointer to the first character in string that is a member of the set stopset instead of the length of the initial substring

strspn Function

     size_t strspn (const char *string, const char *skipset)

The strspn ("string span") function returns the length of the initial substring of string that consists entirely of characters that are members of the set specified by the string skipset. The order of the characters in skipset is not important.

For example,

     strspn ("hello, world", "abcdefghijklmnopqrstuvwxyz")

The output of the above statement will be 5.

Note that "character" is used here in the sense of byte. In a string using a multibyte character encoding, an (abstract) character that consists of more than one byte is not treated as a single entity; each byte is treated separately. The function is not locale dependent.
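Since strspn returns a length, adding it to the string pointer skips the matching prefix. The sketch below skips leading blanks; skip_leading_blanks is a hypothetical helper, and the choice of space and tab as the skip set is an assumption for the example.

```c
#include <assert.h>
#include <string.h>

/* Return a pointer to the first character of 'string' that is
   neither a space nor a tab. */
const char *skip_leading_blanks(const char *string)
{
    assert(string != NULL);
    return string + strspn(string, " \t");
}
```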

strcspn Function

     size_t strcspn (const char *string, const char *stopset)

The strcspn ("string complement span") function returns the length of the initial substring of string that consists entirely of characters that are not members of the set specified by the string stopset. (In other words, it returns the offset of the first character in string that is a member of the set stopset.)

For example,

     strcspn ("hello, world", " \t\n,.;!?")

The output of the above statement will be 5.

Note that "character" is used here in the sense of byte. In a string using a multibyte character encoding, an (abstract) character that consists of more than one byte is not treated as a single entity; each byte is treated separately. The function is not locale dependent.
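When no character from stopset is present, strcspn returns the full length of the string, which makes it convenient for trimming a line ending in place. In this sketch, strip_line_ending is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

/* Truncate 'line' at its first carriage return or newline, if any.
   If neither is present, the index is strlen(line) and the existing
   terminator is simply rewritten. */
void strip_line_ending(char *line)
{
    assert(line != NULL);
    line[strcspn(line, "\r\n")] = '\0';
}
```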

strpbrk Function

     char * strpbrk (const char *string, const char *stopset)

The strpbrk ("string pointer break") function is related to strcspn, except that it returns a pointer to the first character in string that is a member of the set stopset instead of the length of the initial substring. It returns a null pointer if no such character from stopset is found.

For example,

     strpbrk ("hello, world", " \t\n,.;!?")

The output of the above statement will be ", world".

Note that "character" is used here in the sense of byte. In a string using a multibyte character encoding, an (abstract) character that consists of more than one byte is not treated as a single entity; each byte is treated separately. The function is not locale dependent.
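Because strpbrk returns a pointer rather than a length, it works well in a loop that edits each matching character. The sketch below replaces every character from stopset with a space; blank_out is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

/* Overwrite every character of 'string' that appears in 'stopset'
   with a space. strpbrk never returns the terminator, so resuming
   at p + 1 stays inside the string. */
void blank_out(char *string, const char *stopset)
{
    assert(string != NULL && stopset != NULL);
    for (char *p = strpbrk(string, stopset); p != NULL;
         p = strpbrk(p + 1, stopset))
        *p = ' ';
}
```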

Finding Tokens in a String

  • The following are the functions provided by the INTERIX, for finding the tokens in a string

    • strtok (char *restrict newstring, const char *restrict delimiters) - A string can be split into tokens by making a series of calls to the function strtok. The string to be split up is passed as the newstring argument on the first call only

    • strsep (char **string_ptr, const char *delimiter) - This function has functionality similar to strtok_r, with the newstring argument replaced by the save_ptr argument. The initialization of the moving pointer has to be done by the user

It is fairly common for programs to have a need to do some simple kinds of lexical analysis and parsing, such as splitting a command string up into tokens. You can do this with the strtok function, declared in the header file string.h.

strtok Function

     char * strtok (char *restrict newstring, const char *restrict delimiters)

A string can be split into tokens by making a series of calls to the function strtok.

The string to be split up is passed as the newstring argument on the first call only. The strtok function uses this to set up some internal state information. Subsequent calls to get additional tokens from the same string are indicated by passing a null pointer as the newstring argument. Calling strtok with another non-null newstring argument reinitializes the state information. It is guaranteed that no other library function ever calls strtok without your knowledge (which would mess up this internal state information).

The delimiters argument is a string that specifies a set of delimiters that may surround the token being extracted. All the initial characters that are members of this set are discarded. The first character that is not a member of this set of delimiters marks the beginning of the next token. The end of the token is found by looking for the next character that is a member of the delimiter set. This character in the original string newstring is overwritten by a null character, and the pointer to the beginning of the token in newstring is returned.

On the next call to strtok, the searching begins at the next character beyond the one that marked the end of the previous token. Note that the set of delimiters does not have to be the same on every call in a series of calls to strtok.

If the end of the string newstring is reached, or if the remainder of string consists only of delimiter characters, strtok returns a null pointer.

Note that "character" is used here in the sense of byte. In a string using a multibyte character encoding, an (abstract) character that consists of more than one byte is not treated as a single entity; each byte is treated separately. The function is not locale dependent.

For example,

     const char string[] = "words separated by spaces -- and, punctuation!";
     const char delimiters[] = " .,;:!-";
     char *token, *cp;
     cp = strdup (string);                       /* Make writable copy.  */
     token = strtok (cp, delimiters);         /* token => "words" */
     token = strtok (NULL, delimiters);    /* token => "separated" */
     token = strtok (NULL, delimiters);    /* token => "by" */
     token = strtok (NULL, delimiters);    /* token => "spaces" */
     token = strtok (NULL, delimiters);    /* token => "and" */
     token = strtok (NULL, delimiters);    /* token => "punctuation" */
     token = strtok (NULL, delimiters);    /* token => NULL */
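In practice the repeated calls are usually written as a loop. The following sketch collects token pointers into a caller-supplied array; split_tokens and its parameters are hypothetical, and the input buffer is modified in place as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split 'text' in place and store up to 'max_tokens' token pointers
   in 'tokens'. Returns the number of tokens stored. Not reentrant,
   because strtok keeps its position in internal state. */
size_t split_tokens(char *text, const char *delimiters,
                    char *tokens[], size_t max_tokens)
{
    assert(text != NULL && delimiters != NULL && tokens != NULL);
    size_t count = 0;
    for (char *tok = strtok(text, delimiters);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, delimiters))
        tokens[count++] = tok;
    return count;
}
```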

strsep Function

     char * strsep (char **string_ptr, const char *delimiter)

This function has functionality similar to strtok_r, with the newstring argument replaced by the save_ptr argument. The initialization of the moving pointer has to be done by the user. Successive calls to strsep move the pointer along the tokens separated by delimiter, returning the address of the next token and updating string_ptr to point to the beginning of the next token.

One difference between strsep and strtok_r is that if the input string contains more than one character from delimiter in a row, strsep returns an empty string for each pair of characters from delimiter. This means that a program normally should test for strsep returning an empty string before processing it.

This function was introduced in 4.3BSD and therefore is widely available.

For example,

     const char string[] = "words separated by spaces -- and, punctuation!";
     const char delimiters[] = " .,;:!-";
     char *running;
     char *token;
     running = strdup(string);
     token = strsep (&running, delimiters);    /* token => "words" */
     token = strsep (&running, delimiters);    /* token => "separated" */
     token = strsep (&running, delimiters);    /* token => "by" */
     token = strsep (&running, delimiters);    /* token => "spaces" */
     token = strsep (&running, delimiters);    /* token => "" */
     token = strsep (&running, delimiters);    /* token => "" */
     token = strsep (&running, delimiters);    /* token => "" */
     token = strsep (&running, delimiters);    /* token => "and" */
     token = strsep (&running, delimiters);    /* token => "" */
     token = strsep (&running, delimiters);    /* token => "punctuation" */
     token = strsep (&running, delimiters);    /* token => "" */
     token = strsep (&running, delimiters);    /* token => NULL */
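The test for empty strings mentioned above fits naturally into a loop. This is a sketch only: split_nonempty is a hypothetical helper, and the _DEFAULT_SOURCE definition is an assumption for glibc-based systems, where it exposes the strsep declaration under strict compiler modes; other systems may not need it.

```c
#define _DEFAULT_SOURCE   /* assumption: exposes strsep on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split 'text' in place with strsep, skipping the empty tokens that
   adjacent delimiters produce. Stores up to 'max_tokens' pointers in
   'tokens' and returns how many were stored. */
size_t split_nonempty(char *text, const char *delimiters,
                      char *tokens[], size_t max_tokens)
{
    assert(text != NULL && delimiters != NULL && tokens != NULL);
    size_t count = 0;
    char *running = text;   /* the moving pointer, initialized by the caller */
    char *tok;
    while (count < max_tokens
           && (tok = strsep(&running, delimiters)) != NULL) {
        if (*tok != '\0')
            tokens[count++] = tok;
    }
    return count;
}
```

Because the position lives in the running variable rather than in hidden library state, two such loops can be interleaved on different strings.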

Functions not Supported by INTERIX

  • Some functions that are supported on UNIX systems but not by INTERIX are listed below

    • strnlen (const char *s, size_t maxlen) - The strnlen function returns the length of the string s in bytes if this length is smaller than maxlen bytes. Otherwise it returns maxlen.

      INTERIX does not define the strnlen function; for a null-terminated string s, an equivalent expression is (strlen (s) < maxlen ? strlen (s) : maxlen)

    • strndup (const char *s, size_t size) - This function is similar to strdup but always copies at most size characters into the newly allocated string.

    • stpcpy (char *restrict to, const char *restrict from) - This function is like strcpy, except that it returns a pointer to the end of the string to (that is, the address of the terminating null character to + strlen (from)) rather than the beginning.

    • memmem (const void *haystack, size_t haystack-len, const void *needle, size_t needle-len) - This is like strstr, but needle and haystack are byte arrays rather than null-terminated strings. needle-len is the length of needle and haystack-len is the length of haystack

Refer to the above slide for information about functions that are supported by Unix but not by INTERIX.
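When porting code that relies on these functions, small replacements can be written with the functions that are available. The following sketch shows possible substitutes for strnlen, strndup, and stpcpy; the compat_ names are hypothetical, and these are illustrations rather than the UNIX library implementations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Like strnlen: never examines more than 'maxlen' bytes of 's'. */
size_t compat_strnlen(const char *s, size_t maxlen)
{
    assert(s != NULL);
    const char *end = memchr(s, '\0', maxlen);
    return end != NULL ? (size_t)(end - s) : maxlen;
}

/* Like strndup: copies at most 'size' characters into a new,
   null-terminated string. Returns NULL if allocation fails. */
char *compat_strndup(const char *s, size_t size)
{
    size_t len = compat_strnlen(s, size);
    char *copy = malloc(len + 1);
    if (copy != NULL) {
        memcpy(copy, s, len);
        copy[len] = '\0';
    }
    return copy;
}

/* Like stpcpy: copies 'from' to 'to' and returns a pointer to the
   terminating null character written into 'to'. */
char *compat_stpcpy(char *to, const char *from)
{
    assert(to != NULL && from != NULL);
    size_t len = strlen(from);
    memcpy(to, from, len + 1);   /* include the terminator */
    return to + len;
}
```

Using memchr in compat_strnlen avoids reading past maxlen bytes, which the strlen-based expression cannot guarantee for unterminated buffers.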

© 2014 Microsoft