'Application_developing'에 해당되는 글 48건

2007/10/24 놓치기 쉬운개념들

2007/10/24 Java Web Start 강좌

2007/10/16 [MFC] header & library 설명

2007/10/16 RPC(Remote Procedure Call)

2007/08/09 초보자들이 반드시 지켜야 할 C언어 코딩 기법 c/c++ (1)

2007/08/03 알고리즘

2007/08/03 선택알고리즘 (1)

2007/07/24 The Java Virtual Machine

2007/07/22 UML의 기초: Unified Modeling Language 소개

2007/07/20 세마포어란 무엇인가

Application_developing/Java 2007/10/24 12:06

놓치기 쉬운개념들

1. 클래스의 호출 순서

제목 : 자바에 대한 개념을 흔들어 놓는 소스다...

db책은 시간이 없어서 그냥 와우북에서 보고 영진껄루 사서 보았다....전체적인 개념잡기는 쉬운데...책 두께에 비해 깊은 내용이 없더군....하여간 엽기적인 소스를 만난 덕에 또 이렇게 글을 띠운다....

class A {
static
{
System.out.println("A클래스 스태틱 로딩");

}
A() {
System.out.println("A()");
}
}

class ClassInitTest {
static int monthNum=12;
static int monthDays[];
static
{
monthDays = new int[monthNum];
monthDays[ 0] = 31; monthDays[ 1] = 28;
monthDays[ 2] = 31; monthDays[ 3] = 30;
monthDays[ 4] = 31; monthDays[ 5] = 30;
monthDays[ 6] = 31; monthDays[ 7] = 31;
monthDays[ 8] = 30; monthDays[ 9] = 31;
monthDays[10] = 30; monthDays[11] = 31;

System.out.println("static int monthDays[];...");
}

A a;
{
a = new A();
System.out.println("A a;");
}
ClassInitTest() {
this(0);
System.out.println("ClassInitTest()");
}
ClassInitTest(int x) {
System.out.println("ClassInitTest(int x)");
}

public static void main(String args[]) {
new ClassInitTest(); //내생각엔 여기서 생성자를 호출해서 저기 아래로 이동해야 될 것 같은데 a객체를 생성하면서 a클래스로 가버리네...이해할 수 없다...5분만 투자하거라...늘그막이 고생하는 친구를 위해...(엉아라는 표현을 이젠 자제한다)

System.out.println("new ClassInitTest();");
}
}

실행한 결과다....
static int monthDays[];...
A클래스 스태틱 로딩
A()
A a;
ClassInitTest(int x)
ClassInitTest()
new ClassInitTest();

답변: 자바 버츄얼머신 스펙을 참조...

덕분에 썬 사이트에 가서 자바 버츄얼머신 스펙을 꼼꼼히 읽어보았다.
클래스를 초기화 할때 즉 생성하면 생성자가 제일 먼저
불리우는 것이 아니었다.
간단하게 말을 하겠다.

그래야 잊지 않고 나도 기억하기 쉬우니까.
비록 정확한 대답은 아니라고 해도 전혀 틀린 것도 아니니까.

이건 만은 기억하자.

1. static을 초기화한다.
2. field를 초기화한다.
3. 생성자를 호출한다.

그러니까 소스에서는 ClassInitTest를 생성하면
1. 이 클래스내의 static을 초기화한다.
출력: static int monthDays[];...

2. 다음은 필드를 초기화한다.
필드 A a; 가 있으므로 클래스 A를 생성할 것이다.
이 클래스에도 static이 있기 때문에 이를 초기화한다.
출력: A클래스 스태틱 로딩

3. 그리고, A() 생성자를 호출한다.
출력:A()

4. 다음행 실행
출력:A a

5. 다음은 ClassInitTest의 생성자를 호출.
생성자에서 this(0)이라는 인자 있는 생성자를 호출하므로
출력:ClassInitTest(int x)

6. 다음행 샐행
출력:ClassInitTest()

7. 다시 다음행 실행
출력:new ClassInitTest()

아휴~ 힘들다. 1000자 넘기지 않으려고 말을 짧게 썻다.
이해해라.

2. 텍스트 필드에 저장할 수 있는 문자열의 최대 길이

먼저 질문부터 하겠습니다.

Form의 method방식을 get으로 지정했을 경우에 전달할 수 있는 최대 문자열의 길이는 ?

HTTP 데이터 전송 방식에는 get과 post 방식이 있습니다. get방식은 URL에 "이름=값"의 형식으로 데이터를 전송합니다. 쉽게 설명하자면 우리가 많이 사용하고 있는 인터넷 익스플로러를 잘 관찰하시면 다음과 같은 주소를 보시게 될 겁니다.

folderselect=639096&rand=9935556036482

이름과 값을 위와 같은 형식으로 서버에 전달하는 것입니다. 이렇게 get 방식으로 전달할 수 있는 스트링의 길이는 도대체 얼마나 될까 알아보았습니다.

알파벳과 한글 모두 2047문자까지 들어가더군요. 하지만 엄밀히 따지면 앞에 주소도 있고 기다 경로를 제외하면 조금 작아지겠지요. 그런데 실제 데이터를 get 방식으로 전달해 보면 그 보다 훨씬 미치지 못하는 길이만이 넘아갔습니다. 한글로 대략 340문자까지만 전송이 되었습니다[필자가 해보니깐 정말 변수명을 어떻게 하느냐에 따라 틀리고 사파리는 한글이 안된다;;;; 머 어쩌라구]. (+- 오차는 10문자 정도) 이를 post 방식으로 다시 전송하였더니 2000자 이상이 거뜬이 전송되었습니다.

이로써 get 방식과 post 방식의 또 다른 차이점을 알았습니다. 님들도 많은 양의 데이터를 전송할 때는 post 방식을 사용하세요. 무심코 get방식을 사용한다면, 왜 데이터가 제대로 전송이 되지 않는지 그 이유를 찾는데 한참을 보내야 할 겁니다.

3. 각종 데이터 타입의 최대값

MS Access의 MEMO 데이터 형 : Up to 65,535 characters
유닉스의 IPC중 메시지큐 : 메시지의 최개크기 (4096), 큐의 최대 크기(16384)

자바 자료형의 최소/최대값

유형	크기	최소값	최대값
byte	8bit	-128	127
short	16bit	-32768	32767
int	32bit	-2147483648(20억)	2147483647(20억)
long	64bit	-9223372036854775808	9223372036854775807

4. 네트워크 바이트 순서

바이트 오더(byte order)에는 두 가지가 있다. little endian과 big endian이다.

little endian : 작은 수를 끝에 놓는다.
big endian : 큰 수를 끝에 놓는다.

little endian은 56이란 수를 56으로 표현하고, big endian은 56을 65로 나타낼 수 있다는 말이다.

네트워크 상의 byte order은 big endian이다. 가령 little endian 머신에서 little endian머신으로 데이터를 전송한다면 문제가 없다 거꾸로 big endian에서 big endian으로도 마찬가지다. 문제는 다른 방식의 두 머신이 통신을 할 때이다.

일단 네트워크로 데이터를 전송할 때는 네트워크 바이트 순서로 바꿔준 다음 데이터를 수신한 쪽에서 자신의 머신에 맞게 순서를 다시 조정하면 된다. 소켓에선 이런 일을 도와주는 함수가 있으며 htonl, htons, ntohl, ntohs와 같은 것들이 바로 이런 일들을 한다.

참고로 SUN, IBM들은 big endian형식이고 Intel은 little 엔디안 형식이라고 한다. 왜 이렇게 두 가지 방법이 있어서 번거롭게 하는지 모르겠다. 한가지밖에 없다면 아무 신경 쓰지 않고 통신을 하면 될텐데...

5. JAVA 기초 문법 중에서

같은 파일 안에 public class는 단 하나만 존재할 수 있으며 클래스는 파일이름과 같아야 한다.
float 형의 유효 데이터는 소수점이 있는 숫자로 끝에 F 또는 f가 붙은 숫자이어야 한다.
float f = 1.3 -> ( X )
float f = 1.3f -> ( O )
소수점만 있을 경우에는 double형으로 인식된다.

같은 클래스내에서 메소드를 overloading할 때
리턴타입은 상관없이 메소드의 인자만 다르면 된다.

class Test
{
boolean method( int arg ) { return true; }

boolean method( long arg ) { return true; }

boolean method( int arg1, int arg2 ) { return true; }

void method( ) { } //신기하지만 자바에서는 이것도 가능하다.

// 컴파일 에러 발생( 메소드 이름과 인자가 같으면 안된다. )
void method( int arg ) { return true; }
}

헷갈리는 overloading과 overriding 구분하기

"overloading : 짐을 너무 많이 싣다. "
( 같은 클래스에 이름이 같은 메소드를 여러번 정의할 수 있다. )

class Test
{
void method( int arg ) {} // load

void method( int arg1, int arg2 ) {} // load ( overload )
}

"overriding : 짓밟다. 무시하다. "
( 부모로부터 상속받은 메소드를 재정의할 수 있다. )

class SuperClass
{
void method( int arg ) { System.out.println( "SuperClass : " + arg ) }
}

class SubClass extends SuperClass
{
void method( int arg ) { System.out.println( "SubClass : " + arg ) }
}

이 글에는 트랙백을 보낼 수 없습니다

Application_developing/Java 2007/10/24 11:22

Java Web Start 강좌

I. 자바 웹스타트 소개
-------------------------------

자바 웹스타트는 2001년 초에 소개된 기술로 한번 클릭으로 자바 애플리케이션을 쉽게
수행 시킬 수 있는 환경으로 자바가 추가하던 사상 즉 "언제 어디서나 같은 코드로서
쉽게 접근하고 강력한 기능을 발휘하는 언어" 라는 강점을 유감 없이 발휘할 수 있는
기술이다.

자바 웹스타트는 기존의 웹 환경의 한계점을 뛰어넘어 네트웍으로 자유롭게 풍부하고
강력한 응용프로그램을 실행 시킬 수 있으며, 버젼 및 배포 관리를 자동으로 수행하는
JNLP(Java Network Lanunching Protocol)을 기반으로 하여 관리함으로서 기존의 클라이언트
윈도우즈 프로그램의 문제점을 극복하고 새로운 파라다임을 제시한다.

자바 웹스타트의 장점을 요약하면 다음과 같다.

1. 바탕화면이나 웹화면의 아이큰을 단 한번 클릭함으로서 자동 다운로드 및 캐쉬를 사용한
응용 프로그램이 구동되어 빠르고 쉽게 자바 응용 프로그램을 실행할 수 있다.

2. Linux, Mac, Windows 등 다양한 플랫폼에서 동일한 User Interface로 실행됨으로
OS로 부터 독립적인 환경을 제공한다.

3. 웹 환경의 단순한 HTML 및 JavaScript로 구현되던 User Interface를 보다 강력한 자바
UI 기술인 AWT/Swing을 이용하여 구현할 수 있다.

4. 응용프로그램이 자동적으로 캐쉬되고 업데이트 됨으로 첫번째 구동 이후에는 매우
빠른 속도록 응용프로그램이 시작된다.

5. Java 2의 보안 모델 Signed Application 및 "sand box" 모델이 적용됨으로
사용자들은 보다 안전한 환경에서 강력한 UI를 사용할 수 있다.

자바 웹스타트를 좀더 잘 이해 하기 위하여는 Client/Server 아키텍쳐 맥락에서 Client를
크게 Thin Client (가벼운 클라이언트) 와 Rich Client(강력한 기능의 클라이언트)로 나누어 보고
이 둘의 장단점을 비교해보면 서로의 성격을 좀더 확실히 이해할 수 있다.

Thin Client란 Web Browser와 같이 이미 사용자 환경에 Client가 설치되어 있고, 서버로부터
가벼운 HTML과 같은 화면 정보를 받아 화면을 매번 그려주는 체계를 말한다.

- 가벼운 클라이언트 환경 (web browser) 장점

. Browser 만 있으면 언제 어디서나 가볍게 화면을 접속하여 볼 수 있다.
. 화면 자체를 간단히 HTML 로 구성할 수 있다.
. Hyper link 를 이용하여 화면 Navigation을 할 수 있다.
. SessionLess 처리를 이용하여 짧은 트렌젝션을 많이 처리하는데 적합하다.
. 문서 등을 보여주는 (Browsing) 하는데 효과적이다.

- 가벼운 클라이언트 환경 (web browser) 단점

. HTML의 한계 때문에 복잡한 UI 또는 기능성 있는 입출력 구조를 처리하기 어렵다.
. JavaScript 나 Dynamic HTML은 Language 자체가 유동성이 많고
환경 오류의 가능성이 높다.
. JavaScript 또는 Dynamic HTML은 완전한 Language 또는 객체 지향적 Language가
아니라 유지 보수성 및 생산성이 떨어진다.
. 전송된 자료가 쉽게 Browser를 통하여 노출되기 때문에 보안성이 떨어진다.
. HTML의 한계 때문에 다량의 자료 처리가 쉽지 않다.
. 옵션 및 환경이 Browser에 의존적이기 때문에 사용자 환경마다 처리될 때도 있고
처리가 않 될 수도 있다.
. 트렌젝션 마다 Image + HTML + Data 가 함께 내려 옮으로써 처리 속도가 저하된다.

Rich Client(강력한 기능의 클라이언트) 란 기존의 C/S 시스템 또는 전용 Browser 나
Windows 프로그램과 같이 특정 업무에 맞도록 화면을 개발하고, 메뉴, 아이콘, 테이블, 트리,
드래그드랍등 다양하고 강력한 사용자 화면과 인터페이스를 구성하는 것이다.

Browser와 같이 이미지 디자인 같은 작업이 별도로 필요 없는 대신에 Excel 처럼 필드 위주의
강력한 기능을 구현할 수 있는 것이 특징이다.

- Rich Client의 장점

. 강력하고 기능성 있는 GUI 를 제공한다. (Spread Sheet, DeskTop, Tree, Drag and Drop 등등)
. Browser에 의존하지 않고 개인 PC의 화면에서 바로 실행 가능하다.
. Remote 환경 뿐만 아니라 Local 환경, DB, Network 처리, File 처리 등이 자유롭다.
. 서버와 데이타 만 주고 받기 때문에 속도가 빠르다.
. 다량의 데이타를 local 과 서버 작업으로 분산 처리가 가능하기 때문에 다량 처리에 적합하다.
. Browser 의존적이 아니기 때문에 PC 자체의 환경 문제가 적다.
. 객체 지향적 모델로 GUI를 만들기 때문에 유지 보수성이 뛰어 나다.
. 전송되는 자료가 쉽게 노출되지 않으므로 보안성이 우수하다.

- Rich 클라이언트의 단점

. 프로그램 설치가 필요하여 초심자가 설치할 시에 오류가능성이 있다.
. 설치된 프로그램이 다른 요인으로 손상될 수 있다.
. 반드시 모듈이 설치된 PC에서만 작동한다.
. 프로그램 변경 또는 Version Up이 발생하면 다시 배포해야 한다. (Version 문제 가능성)
. 대부분 MicroSoft Windows 환경이기 때문에 Windows 버젼 및 환경 때문에
오류 발생 가능성이 있다.

Java Web Start는 Thin Client와 Rich Client의 장점을 동시에 수용하고 각각의 단점을 서로
보안할 수 있도록 통합시킨 기술이다.

Java Web Start 기술을 채용하면 Thin Client 처럼 어디서나 쉽게 접근하여 가볍게 사용할 수
있을 뿐만 아니라 Rich Client와 같은 강력한 기능을 구현할 수 있어 웹 기술과 C/S 기술을
동시에 채용하여 비용 대비 성능이 가장 우수하다.

- Java Web Start 는 Thin Client 의 장점과 Rich Client의 장점을 통합한 기술이다.

. web 환경과 같이 언제 어디에서든 손쉽게 사용할 수 있다. (아이콘 클릭으로 시작)
. plug-in 이 자동으로 설치되며 일단 설치되면 다시 설치할 필요 없다. (웹으로 시작 가능)
. Application 모듈이 local 에 cache 되어 자동으로 버젼을 체크하여,
download 시간이 대폭 축소된다.
. C/S 환경이 갖고있는 강력하고 다양한 GUI 기술이 Swing을 통하여 구현되며,
독립 데스크 탑 환경으로 갈 수 있다.
. Web Version 이나 Client Server 버전을 따로 개발할 필요가 없다. Web Start 가
두개의 환경을 단일 코드로 지원한다.
. JavaScript, DHTML 등에 의존하지 않고, 독립환경의 단일 기술로 이루어짐으로서
오류 가능성이 적다.
. 결국 손쉽게 접근할 수 있는 Thin Client 의 장점과 강력한 GUI 및 작업 기능의
Rich Client 의 장점을 동시에 구현하는 것이다.

Java Web Start 환경적 특징을 살펴보면 다음과 같다.

- Java Web Start 환경은 새로운 기술이라기 보다는 Java 기술의 최종 완성된 형태이다.

- Web Version 이나 Client Server 버전을 따로 개발할 필요가 없다.

- JVM Version 1.4는 자동으로 Install 된다.

- XML, Security, Messaging 등 최신 기술 Component 가 같이 포함되어 있다. (미래가 있다.)

- Web Start 는 자동으로 Caching 을 하여 빠르다.

- Web Browser 나 MS Windows 환경에 독립적이다.

- 따로 Desktop 환경을 구성할 수 있다.

II. 웹스타트 실행하기
-----------------------

자바 웹스타트가 어떤것인가를 가장 직관적으로 알 수 있는 방법은 실제로
웹스타트로된 시스템을 실행해보는 것이다. 웹스타트는 앞에서도 설명했듯이
언제 어느 곳에서나 쉽게 실행해 볼 수 있다는 장점을 가지고 있다. 따라서
여러분들이 지금 일반적인 네트웍 환경과 PC 환경을 가지고 있다면 직접
웹스타트 시스템을 실행해 보는 것이 자바 웹스타트를 가장 빨리 이해하는
지름길 이다.

본 강좌에서는 자바 웹스타트로 된 시스템을 실행해보기 위하여 JComtech에서
개발한 "JDMS"를 예제로 하겠다. JDMS는 통신 판매(Direct Marketing) 회사들을
위한 판매관리 ASP 시스템으로, 현재 몇개 회사가 JDMS를 공동으로 사용하고 있다.
상용 시스템임으로 테스트 사용시 무리한 트렌젝션을 발생시키지 않기를 바란다.

JDMS에 접근하려면 우선 웹브라우져에서 http://jdms.jcomtech.co.kr로 가서
홈페이지를 방문한다. 홈페이지에는 JDMS에 관련된 소개 및 사용법, 사용자 게시판
으로 구성되어 있다. 이러한 내용들은 업무관련 내용임으로 간단히 읽고 참고하기
바란다.

자바 웹스타트로 JDMS를 실행하려면 홈페이지 좌측 상단에서 "Application"이라는
메뉴를 찾아 클릭한다. 이 메뉴를 클릭하면 우선 사용자 브라우져에 JRE 1.4 버젼이
설치되어 있는가를 검사한다. 만일 설치가 되있지 않다면 자동으로 Plug-in을 설치
하도록 되어 있다. 만일 JRE 1.4를 설치하겠냐고 묻는 화면이 나오면 "예"를 눌러
설치하도록 한다. 설치하는데는 다소 시간이 걸린다. 하지면 처음 실행시 한번만
실행하면 된다.

자바 웹스타트는 자바 1.2 버젼부터 소개되었지만 1.4 버젼에서는 설치 및 운용이
자동으로 되기 때문에 더욱편리하다. 따라서 본 강좌는 자바 1.4 버젼을 전제로
진행 하겠다.

JRE 1.4가 모두 설치되었다면 브라우져 화면에 자바 로고와 "JDMS Web start [시작]"
라는 메뉴가 나온다. 이를 클릭하면 자바 웹스타트 로고가 나오면서 프로그램이
설치되어 있는 웹서버로부터 자바 클레스 화일들을 다운로드 받는 화면들이 진행된다.
처음 실행할 때는 웹서버로부터 JDMS를 실행하기위하여 필요한 모든 화일들을 다운로드
받겠지만 다음번 부터는 버젼이 변경된 부분만 다운로드 받기 때문에 속도가 매우
빨라진다.

자바 프로그램 다운로드가 성공적으로 이루어지면 즉시 프로그램이 실행되면서 JDMS
로그인 화면이 나오게 된다. 로그인 아이디와 암호는 jcom/jcom으로 한다. 아이디와
암호가 맞으면 스윙으로 구성되 단독 윈도우가 실행되며, 드디어 JDMS 시스템 화면이
나오게 된다. 화면에 있는 메뉴들을 한번씩 실행해보며 시스템이 어떻게 실행되는지
경험해보기 바란다. 구체적인 JDMS은 업무 구성이나 사용법은 본 강좌에서 생략하도록
한다.

JDMS를 한번 둘러보았다면 화면을 종료하고 브라우져도 닫기 바란다. 그러한 후에
여러분 PC에서 윈도우 바탕화면에서 Java Web Start 아이콘을 찾아보기 바란다.
정상적인 경우라면 JRE 1.4를 설치하면서 바탕화면에 "Java Web Start" 아이콘을 만든다.
이 아이콘은 웹스타트 설치를 관리할 수 있는 "Java Web Start 응용프로그램 관리자"를
실행하는 아이콘이다.

Java Web Start 응용프로그램 관리자에는 PC에서 실행된 적이 있는 응용프로그램과
SUN에서 제공하는 몇가지 예제 프로그램을 실행할 수 있는 아이콘이 있어 웹브라우져를
통하지 않고도 직접 자바 프로그램을 실행할 수 있다. 웹브라우져에서 실행한 적이
있는 JDMS로 아이콘으로 직접 실행할 수 있을 것이다. Java Web Start 응용프로그램
관리자를 이용하지 않고 직접 JDMS를 바탕화면에 등록하여 사용할 수도 있는데 이는
프로그램 실행시 등록 여부를 물어서 아이콘 등록을 하게된다.

이렇게 JRE 1.4 Plug-in을 설치하고 JDMS를 실행하였다면, 여러분은 자바 응용프로그램
시스템인 JDMS를 웹페이지에서도 실행할 수 있고, 직접 바탕화면에서도 실행할 수 있게
된다. 그러고 처음 실행할 때문 프로그램 다운로드 시간이 걸리겠지만 두번째 부터는
프로그램이 빠르게 실행된다는 것을 느낄 수 있을 것이다. 프로그램 Caching 및 버젼
관리는 웝스타트 환경이 자동으로 해주기 때문이다.

JDMS를 다시 웹에서 실행해보고 또 다시 바탕화면에서 번갈아 실행해봄으로서 사용자들이
어떠한 환경에서 사용하게되고 어떠한 장접이 있는지 직관적으로 알게될것이다.

개발자 입장에서 이러한 환경을 잘 분석해보면 응용 프로그램은 자바가 제공하는 AWT 및
Swing으로 만든 윈도우 환경이라는 것을 알게될 것이다. 결국 웹스타트 환경에서 응용
프로그램을 잘 개발하려면 자바 순수 프로그램 및 자바 UI 프로그램(AWT/Swing)기술이
기본적으로 있어야 하면 웹스타트의 서버 환경은 웹서버 설치후 자바 클레스를 묶은 jar
화일및 jnlp 프로토콜 mine 타입을 지정하면 된다. 이러한 서버 설정 방법은 이 후에
다루도록 한다.

III. 간단한 프로그램 만들기 
------------------------------

자바 웹스타트 환경을 만들려면 우선 그 환경에서 실행될 응용 프로그램이 필요하다.

비교적 간단한 프로그램으로 자바 웹스타트의 다양한 측면을 테스트 하기 위하여 자바 Swing으로

간단한 편집기를 만들어 보겠다. 간단한 편집기의 클래스 명은 SimpleEditor로서 다음과 같은

화면으로 구성되어 있다.

[소스보기 - 파일첨부 참조] 소스를 살펴보면 우선 buildGUI() 메소드에서

메뉴와 화면을 만들며 그리고 이벤트 처리를 위한 리슨너 등을 추가하고 있다.

주요 기능 및 메소드를 살펴보면 다음과 같다.

- buildGUI() 화면의 필드와 메뉴를 만들며, 이벤트 처리가 가능하도록 각각의 컴퍼넌트에

ActionListener 및 DocumentListener를 추가하고 있다.

- checkUpdate() 테스트 필드에 변경이 있으면 이를 화일에 저장할지를 묻고,

사용자가 "예"를 선택하면 이를 화일에 저장한다.

그리고 "취소" 버튼을 누르면 취소를 true 값으로 리턴한다.

- newFile() 화면을 clear하고 상태를 초기 상태로 바꾸어 새로운 화일을 편집할 수 있도록 한다.

- open() FileChooser를 이용하여 기존의 테스트 화일을 열고 이를 테스트 필드에 보여주는 
         기능을 한다.

- close() 편집하던 화일을 닫는 기능을 수행한다.

- save() 편집하던 화일을 저장하는 기능을 수행한다. 만일 화일명이 없으면 FileChooser를 
         이용하여 화일명을 지정하도록 한다.

- exit() 편집하던 화일이 있으면 저장하고 편집기 프로그램을 종료한다.

 위의 메소드들은 각각의 메뉴 선택시에 actionPerformed 메소드에서 실행되면

현재 화일 문서가 변경되었는가의 여부는 DocumentListener를 사용하여 이벤트 처리한다.

이 프로그램을 실행하려면 SimpleEditor.class 화일을 JRE가 설치된 PC에 복사하고

단지 "Java -classpath . SimpleEditor" 라는 명령어를 치면 실행되어 아무런 문제 없이

 Local Computing 환경에서 사용할 수 있다. 하지만 다음 문제는 이 프로그램을 여러사람이

 자바 웹스타트를 이용하여 네트웍 환경으로 여러사람이 사용할 수 있게하는 것이다.

이러한 환경은 응용 프로그램에 추가적인 고려 사항이 필요하며 또 자동 배포를 수행할 수 있는

 웹 서버가 필요한데 자세한 내용은 본 SimpleEditor 프로그램을 예제로 다음 장에 계속하여

 설명하겠다.

IV. 웹스타트 응용프로그램
-------------------------

기본적으로 웹스타트 응용프로그램은 SimpleEditor에서 보았듯이 Java2 개발 방법을 사용하며 public static void mina(String[] arg)의 메인 함수에 의하여 실행된다. 하지만 이러한 응용프로그램을 웹스타트와 같이 자동 다운로드 환경에서 사용하려면 보안 및 공유 환경때문에 다음과 같이 몇가지 고려해야 할 사항이 있다.

- 자바 class화일들은 다운로드가 용이할 수 있도록 모두 JAR 화일로 묶어야 한다.

- 그림 화일이나 추가적인 화일 및 자원을 사용한다면 이 모든 것을 JAR로 묶어야 한다.

- SandBox 보안 모델을 사용함으로 다음과 같이 Local 시스템 자원을 사용하지 못한다.

. local disk 및 native libarary를 사용하지 못한다.
. Network 자원은 자신이 download된 서버로부터 뿐이 사용하지 못한다.
. Security Manager을 설치할 수 없으며, 시스테 프로퍼티를 제한적으로 사용한다.

- 이러한 보안상 제약점을 넘으려면 signed JAR 화일을 사용해야 한다.

그럼 SimpleEditor.class를 이러한 사항을 고려하여 배포 준비를 해보도록한다.

우선 SimpleEditor.class를 다음과 같은 JAR 커멘드를 이용하여 .jar 화일로 묶는다.

-----------------------------------
prompt> jar cvf editor.jar *.class
-----------------------------------

이렇게 하면 배포할 수 있는 editor.jar 화일이 생성된다. 만일 클레스 화일 또는 이미지,
리소스 등 화일이 여러개라면 jar 커멘드를 이용하여 같은 화일에 묶도록한다.

SimpleEditor 프로그램은 화일을 읽거나 저장하기 위하여 local 디스크를 접근한다.
하지만 일반 jar 화일로는 SandBox 보안 모델 때문에 local 디스크를 접근할 수 없다.
따라서 local disk에 접근할 수 있는 권한을 부여하기 위하여 다음과 같이 jar 화일에 keytool 및 jarsigner를 이용하여 인증서 sign을 할 필요가 있다. (keytool 및 jarsigner는 JDK1.2 이상 버젼에 포함되어있는 툴이다.)

1. 다음과 같이 keytool 커멘드를 이용하여 teststore 화일에 test 라는 별명으로 새로운 키를 만든다.

----------------------------------------------------------
prompt> keytool -genkey -keystore teststore -alias test

keystore 암호를 입력하십시오: test12345
이름과 성을 입력하십시오.
[Unknown]: editor
조직 단위 이름을 입력하십시오.
[Unknown]: editor
조직 이름을 입력하십시오.
[Unknown]: editor
구/군/시 이름을 입력하십시오?
[Unknown]: editor
시/도 이름을 입력하십시오.
[Unknown]: editor
이 조직의 두 자리 국가 코드를 입력하십시오.
[Unknown]: editor
CN=editor, OU=editor, O=editor, L=editor, ST=editor, C=editor이(가) 맞습니까?
[아니오]: y

<test>에 대한 키 암호를 입력하십시오
(keystore 암호와 같은 경우 RETURN을 누르십시오): test
키 암호가 너무 짧습니다. 여섯 글자 이상이어야 합니다.
<test>에 대한 키 암호를 입력하십시오
(keystore 암호와 같은 경우 RETURN을 누르십시오): test12345
--------------------------------------------------------------------

2. 생성된 keystore 화일과 그 화일에 생성된 키를 이용하여 자체 signed 인증서를 생성한다

----------------------------------------------------------
prompt> keytool -selfcert -alias test -keystore teststore

keystore 암호를 입력하십시오: test12345
----------------------------------------------------------

3. keystore 화일 teststore에 저장된 키 정보를 보려면 다음과 같이 명령을 주고 암호를 입력한다.

------------------------------------------
prompt> keytool -list -keystore teststore
------------------------------------------

4. 최종적으로 만들어진 인증서를 이용하여 다음과 같이 jarsigner를 이용하여 jar 화일에 sign한다.
암호를 물으면 teststore 생성시 사용된 암호를 입력한다.

------------------------------------------------------
prompt> jarsigner -keystore teststore editor.jar test
------------------------------------------------------

이렇게서 만들어진 signed-jar 화일인 editor.jar는 최종적으로 자바웹스타트를 통해서 배포할 수 있는 화일이다. 웹스타트를 통하여 배포되는 화일이 local 자원이나 network 자원을 이용하려면 모두 이렇게 signed-jar를 만들어야 한다.

이렇게 JAR화일을 준비하면 응용프로그램은 배포 준비가 완료된 것이다. 하지만 이렇게 응용 프로그램만 준비한다고 해서 모든 과정이 끝난 것은 아니다.

우선 이 프로그램이 어떤 것이고 어떻게 배포될지를 기술하는 XML 문법으로 이루어진 jnlp 화일이 필요하고, jar와 jnlp을 적절이 설정된 웹서버에 올리는 과정이 필요한다.

이러한 내용은 다음 장에서 다루도록 한다.

V. JNLP 화일 및 웹서버 설정
-----------------------------

JNLP(Java Network Launching Protocol) 화일은 자바 웹스타트에서 자동 배포될 모듈에대한
정보 및 어떻게 배포될 것인 가에 대한 정보를 기술해주는 화일로 웹서버에서는 우선 이
화일부터 호출 되고 이 화일에 기술된 대로 JAR 화일을 다운로드하여 프로그램이 실행되도록
되어 있다.

JNLP 화일은 기본적으로 XML 문법 체계를 가지고 있으며, 모든 테그에 대한 정보를 자세히
보려면 SUN site에서 JNLP Spec v1.0을 참고하도록 한다. 여기서는 JNLP에대한 기본적인 사항을
예제 위주로 살펴보겠다.

다음은 SimpleEditor.jar를 배포할 수 있도록 만들어진 simple.jnlp 화일의 내용이다.

=================================
file : simple.jnlp
=================================

<?xml version="1.0" encoding="utf-8" ?>
<jnlp spec="1.0+" codebase="http://www.javanuri.com/jaws/apps" href="simple.jnlp">

<information>
<title>Simple Editor Demo</title>
<vendor>JavaNuri</vendor>
<description>Simple Editor Demo Program</description>
<description kind="short"> Simple Editor</description>
<icon href="images/editor.jpg" />
<offline-allowed />
</information>

<security>
<all-permissions />
</security>

<resources>
<j2se version="1.4" />
<jar href="editor.jar" />
</resources>

<application-desc main-class="SimpleEditor" />

</jnlp>

본 simple.jnlp 예제에서 첫라인은 본 문서가 XML이고 encoding이 utf8로 되어 있음을
기술하는 것이고 다음부터 나오는 JNLP 주요 태그 내용은 다음과 같다.

- jnlp tag : <jnlp spec="1.0+" codebase="http://www.javanuri.com/jaws/apps" href="simple.jnlp">
본 내용이 jnlp에 관련된 내용임을 기술하며 문서의 끝에 </jnlp>로 닫힌다.

. spec 속성은 jnlp 1.0 버젼 이상임을 정의한다.
. codebase 속성은 본 문서의 모든 화일의 기본 URL을 정의한다.
. href 는 본문서 화일 이름을 정의한다.

- information tag : <information> ... </information>
각종 하위 태그로 본 프로그램에 관한 제목, 제작사, 설명, 아이콘 등을 기술하고 있으며,
이는 배포 및 실행될 때 사용자가 화면을 통해 볼 수 있는 아이콘 및 설명으로 사용된다.

- security tag : <security> ... </security>
이 태그는 프로그램에 실행될때 주는 local 자원에 대한 권한을 기술한다.

- resource tag : <resource>... </resource>
이 태그는 하위 태그를 이용하여 j2se 버젼 및 사용되는 jar 화일들을 기술하여
실행환경에 대한 정보를 담고 있다.

- application-desc tag : <application-desc main-class="SimpleEditor" />
이 태그는 플로그램 실행시 필요한 사항을 기술하며 특히 main-class 는 반드시
지정되어야할 속성으로 자바 프로그램의 main 메소드가 있는 클래스명을 써준다.

여러분이 여기까지해서 JAR 및 JNLP 화일을 만들었다면 이를 적절하게 배포해줄
웹서버 환경이 필요하다. 자바 웹스타트는 기본적으로 HTTP 프로토콜을 사용하며 클래스가
실려있는 JAR화일 및 이 내용을 기술하고 있는 JNLP화일은 결국 웹서버의 서비스 디렉토리에
복사되어야 한다.

웹서버는 다양한 종류가 있으며, 기본적인 환경설정은 해당 웹서버 제품의 메뉴얼에
따라야할 것이다.

본 강좌에서는 가장 많이 쓰이고 있는 Apache 웹서버를 예제로하여 환경을 설정할 것이다.
다른 웹서버도 크게 차이는 없으리라 생각된다.

따라서 www.apache.org 사이트를 방문하여 windows, linux, solaris 등 여러분의 환경에 맞는
apache웹서버를 다운로드하여 메뉴얼이 지시하는데로 설치하도록 한다.
본 강좌에서는 웹서버 설치에 관해서 다루지는 않겠다.

웹서버를 설치한 후에 가장 먼저 해주어야 할일은 jnlp 확장자 화일에 관한 mime type의
설정이다.

mime type의 설정은 웹서버로 하여금 jnlp 확정자를 갖은 화일은 어떠한 컨테츠로 취급할
것인지를 알려준다.
아파치 웹서버의 경우는 설정 디렉토리에서 mime.types 화일에 다음과 같이 한 라인만
추가해 주면 된다.

application/x-java-jnlp-file JNLP

추가한 후에는 반드시 웹서버를 다시 시작해야 한다.

이렇게 웹서버 설정이 끝났다면 지금까지 만들었던 jar 화일과 jnlp 화일을 웹서버의
document root 에 적절히 복사해놓는 것이다. 이것은 마치 웹서버에 html 화일을 복사해놓는
것과 마찬가지이다. jar 및 jnlp 화일과 함께 복사하여야 할 것은 simple.jnlp에서
지정한 아이콘 화일인 images/editor.gif이다. (첨부 File 이용)

아래에 첨부된 file을 다운로드하여 images 디렉토리에 복사하기 바란다.

simple.jnlp 화일은 경우는 반드시 jnlp 테스의 codebase 속성에 기술되어 있는 주소를
반드시 여러분의 웹서버 설치 주소로 변경해야한다. 그렇지 않으면 다운로드시 오류가 발생할 것이다.

여기까지 하면 SimpleEditor를 자바 웹스타트를 이용하여 실행할 준비가 모두 끝난 것이다.
실제 실행 및 실행시 유의 사항등은 다음 장에서 다루도록 한다.

VI. 예제프로그램 웹스타트로 실행하기
---------------------------------------

웹스타트 응용프로그램 개발 과정을 다시한번 정리해보면 다음과 같다.

1. 실행될 응용프로그램을 AWT/Swing으로 개발한다.
2. 개발된 응용프로그램을 배포할 수 있도록 jar로 묶는다.
3. Local 자원을 사용할 수 있도록 jar에 sign을 하여 signed-jar를 만든다.
4. 배포 실행될 응용프로그램에 대한 내용을 jnlp 화일을 만들어 xml로 기술한다.
5. web 서버에 JNLP mime type을 설정한다.
6. web 서버의 적절한 디렉토리에 singed-jar, jnlp 화일 및 이미지 화일을 복사한다.

여기까지의 사항이 완료되었다면 일단 웹브라우져에서 웹스타트 응용프로그램을
실행할 수 있는 웹서버 환경이 갖추어진 것이다.

실제 예제로 만들어진 editor.jar 와 simple.jnlp를 브라우져에서 실행하려면
웹서버를 실행하고 단순히 다음과 같이 simple.jnlp를 호출하는 URL을 치면 된다.

http://www.javanuri.com/jaws/apps/simple.jnlp

브라우져에 jre1.4가 정상적으로 설치되었으면 먼저 "Web Start Logo"화면이 뜨고
보안을 위해 jar에 sign된 내용이 나타나면서 승인 여부를 묻는 화면이 나온다.
여기에서 "예"를 선택하면 즉시 editor 프로그램이 실행될 것이다.
그리고 두번째로 웹에서 실행할 때는 바탕화면에 아이콘 생성 여부를 물을 것이다.
여기에서 "예"를 선택하면 바탕화면에 아이콘이 생성되면서 다음 부터는 웹브라우져를
통하지 않고 바탕에 있는 아이콘을 단순히 클릭함으로써 응용프로그램이 시작될 것이다.

이제까지 만들어왔던 SimpleEditor 예제에 대한 관련 jar, jnlp, jpg 화일들은
example.zip에 묶어 첨부되어 있다. 이를 다운로드 받아 압축을 풀고 웹서버에 설치하고
실행시켜보기 바란다. 단 실행시 jnlp화일에 있는 codeBase 속성은 여러분의 웹서버 주소에
맞도록 수정해야 한다.

서버 설정은 이제까지 설정한데로 하면 보통은 이상없이 환경이 설정된다. 하지만
클라이언트 즉 브라우져 환경은 사용자에 따라 웹스타트가 설정되어 있을 수도 있고, 아닐수
도 있다. 따라서 이러한 브라우져 환경을 통제하려면 html 및 javascript를 이용할 필요가
있다.

사용자 브라우져를 체크해 웹스타트를 실행할 수 있는 방법은 여러가지가 있을 수 있지만
대체로 다음 세가지 방법이 있다.

1. 자바스크립트를 이용해 웹스타트 설치여부를 체크한 다음 설치돼지 않았을 경우
설치하라는 메시지 링크를 주는 방법

2. 웹스타트 버젼1.2의 자동 설치 기능을 이용하는 방법

3. 간단한 JRE1.4를 이용한 Applet을 이용하는 방법


첫번째 방법인 자바웹스타트 설치여부를 체크하는 방법은 우선 웹브라우져 종류를 파악해야 하는데,
이방법은 "Java Web Start Developers's Guide"에서 사용한 자바스크립트인 xbDetectBrowser()를 사용하여
할 수 있다. xbDetectBroser.js는 본 글에 첨부 하였다.

다음 예제는 "Java Web Start Developers's Guide"에 나오는 예제를 응용하여 이해하기 쉽게 변경한
것으로서 스크립트를 실행시키면 Java Web Start 가 설치되었을 경우는 해당 프로그램을 실행할 수 있는
URL 링크가 나오고, 설치되어 있지 않을 경우는 Java Web Start를 설치하라는 메시지가 나오는 예제이다.

Simple1.html
=====================================================================================================
<HTML>
<HEAD><TITLE>Java Web Start Page</title>

<SCRIPT LANGUAGE="JavaScript" SRC="xbDetectBrowser.js"></SCRIPT>


<SCRIPT LANGUAGE="JavaScript">

var javawsInstalled = 0;
isIE = "false";

if (navigator.mimeTypes && navigator.mimeTypes.length) {
x = navigator.mimeTypes['application/x-java-jnlp-file'];
if (x) javawsInstalled = 1;
} else {
isIE = "true";
}

</SCRIPT>



<SCRIPT LANGUAGE="VBScript">
on error resume next
If isIE = "true" Then
If Not(IsObject(CreateObject("JavaWebStart.IsInstalled"))) Then
javawsInstalled = 0
Else
javawsInstalled = 1
End If
End If
</SCRIPT>



<SCRIPT LANGUAGE="JavaScript">
function insertLink(url, name) {
if (javawsInstalled || navigator.family == 'gecko') {
document.write("<a href=" + url + ">" + name + "</a>");
} else {
document.write("Java Web Start 설치가 필요합니다.");
}
}
</SCRIPT>



<SCRIPT LANGUAGE="Javascript">
insertLink("http://www.javanuri.com/jaws/apps/simple.jnlp", "웹스타트 응용프로그램 시작");
</SCRIPT>

<BODY>
</BODY>
</HTML>

=====================================================================================================

두번째 방법인 자동 설치 기능을 이용하는 방법은 썬의 자바 사이트에서 제공하는 Web Start자동 설치
기능을 수행하는 ActiveX를 호출하는 것이다. Java Web Start만을 자동 설치하는 OBJECT 테그 및 URL은

<OBJECT CODEBASE="http://java.sun.com/products/javawebstart/autodl/jinstall_javaws-1_2-windows-i586.cab"
CLASSID="clsid:5852F5ED-8BF4-11D4-A245-0080C6F74284" HEIGHT=0 WIDTH=0>
이며

자바 1.4 버전을 자동 설치할 수 있는 OBJECT 테그 및 URL은

<OBJECT CODEBASE="http://java.sun.com/products/plugin/autodl/jinstall-1_4_1-windows-i586.cab"
CLASSID="clsid:5852F5ED-8BF4-11D4-A245-0080C6F74284" HEIGHT=0 WIDTH=0>

이다.

이왕 자동 설치 기능을 사용하려면 web start만 설치하는 것 보다는 응용프로그램 환경이 더 좋은
자바 1.4버전을 설치하는 것이 좋다.

다음 예제는 위의 예제와 같이 기존에 web start설치 여부를 체크하고 없으면 자동설치 되도록 자바
스크립트 및 HTML을 작성 예를 보여주고 있다. ( Developer's Guide 참조 )

Simple2.html
=====================================================================================================



<SCRIPT LANGUAGE="JavaScript">
var javawsInstalled = 0;
var javaws12Installed = 0;

isIE = "false";

if (navigator.mimeTypes && navigator.mimeTypes.length) {
x = navigator.mimeTypes['application/x-java-jnlp-file'];
if (x) {
javawsInstalled = 1;
javaws12Installed=1;
}
} else {
isIE = "true";
}
</SCRIPT>



<SCRIPT LANGUAGE="VBScript">
on error resume next
If isIE = "true" Then
If Not(IsObject(CreateObject("JavaWebStart.isInstalled"))) Then
javawsInstalled = 0
Else
javawsInstalled = 1
End If
If Not(IsObject(CreateObject("JavaWebStart.isInstalled.2"))) Then
javaws12Installed = 0
Else
javaws12Installed = 1
End If
End If
</SCRIPT>


<SCRIPT LANGUAGE="JavaScript"
SRC="xbDetectBrowser.js">
</SCRIPT>



<SCRIPT LANGUAGE="JavaScript">
if (javawsInstalled || navigator.family == 'gecko') {
document.write("<a href=http://www.yyy.zzz/app.jnlp>프로그램 실행</a>");
} else {

document.write("<a href=http://dlres.java.sun.com/PluginBrowserCheck?");
document.write("pass=http://www.yyy.zzz/download.html&");
document.write("fail=http://java.sun.com/cgi-bin/javawebstart-platform.sh>");
document.write("설치 및 응용프로그램 실행</a>");

}
</SCRIPT>

=====================================================================================================

download.html
=====================================================================================================


<HTML>
<BODY>
<OBJECT CODEBASE="http://java.sun.com/products/javawebstart/autodl/jinstall_javaws-1_2-windows-i586.cab"
CLASSID="clsid:5852F5ED-8BF4-11D4-A245-0080C6F74284" HEIGHT=0 WIDTH=0>
<PARAM NAME="app" VALUE="http://www.yyy.zzz/app.jnlp">
<PARAM NAME="back" VALUE="true">

<A HREF="http://java.sun.com/cgi-bin/javawebstart-platform.sh?">
Download Java Web Start</A>
</OBJECT>
</BODY>
</HTML>
=====================================================================================================

이 방법은 결국 썬에서 제공하는 설치 ActiveX를 이용하는 것이고 그렇기 때문에 IE에서만 작동한다.
그리고 설치가 않될 경우는 플랫폼에 따라 설치 메뉴얼이 있는 페이지로 이동하게 된다. 이 방법을
잘 파악하려면 다음과 같이 썬에서 제공하는 각종 URL을 잘 이해하고 있어야 한다.

- http://dlres.java.sun.com/PluginBrowserCheck
pass와 fail파라미터가 있어 브라우져가 IE일 경우 pass에서 준 URL로 링크되고 아닐 경우는
fail에서 준 URL로 링크된다. 파라미터가 없을 경우 플랫폼에 따라 JRE를 설치할 수 있는
메뉴얼 페이지로 간다.

- http://java.sun.com/cgi-bin/javawebstart-platform.sh
플렛폼에 따라 자바 웹스타트를 설치할 수 있는 메뉴얼 페이지로 이동한다.

- <OBJECT CODEBASE="http://java.sun.com/products/javawebstart/autodl/jinstall_javaws-1_2-windows-i586.cab"
CLASSID="clsid:5852F5ED-8BF4-11D4-A245-0080C6F74284" HEIGHT=0 WIDTH=0>
자바 웹스타트를 자동 다운로드 및 설치하는 ActiveX를 구동한다. "app"파라미터는 설치후 자동 링크되는
URL 값이다.

- <OBJECT CODEBASE="http://java.sun.com/products/plugin/autodl/jinstall-1_4_1-windows-i586.cab"
CLASSID="clsid:5852F5ED-8BF4-11D4-A245-0080C6F74284" HEIGHT=0 WIDTH=0>
JRE1.4를 자동 다운로드 및 설치하는 ActiveX를 구동한다. 웹스타트는 자바 1.4버젼에 포함되어 있다.

자동 설치 및 실행의 세번째 방법은 간단하게 시작 버튼을 가진 Applet을 만들고 Applet실행을
JRE 1.4로 할 수 있도록 요구하는 것이다. 이렇게 하면 Applet이 실행되면서 JRE 설치 버젼을
체크해 자동으로 JRE1.4가 설치되며 자바웹스타트도 자동 설치된다.

Applet을 조금더 응용하면 Applet에 실행버튼을 만들어 웹스타트 응용 프로그램을 바로 실행할 수 있도록
하는 것이다. 다음은 그러한 버튼을을 갖은 예제 Applet 및 실행 HTML이다.

StartApplet.java
=====================================================================================
import java.awt.*;
import java.awt.event.*;
import java.net.*;

/**
* StartApplet
*
*/

public class StartApplet extends java.applet.Applet implements ActionListener {

// start button
Button btStart = new Button("응용프로그램 시작");

// url
String url;

//--------------------------------
// init
//--------------------------------
public void init() {

btStart.addActionListener(this);
add(btStart);

url = getParameter("url");

}

//----------------------------------------
// action performed
//----------------------------------------
public void actionPerformed(ActionEvent e) {

try {
getAppletContext().showDocument(new URL(url));
}catch(Exception ex) {
ex.printStackTrace();
}

}

}

==============================================================================

StartApplet.html
==============================================================================

<html>
<body>

<OBJECT name="StartApplet" classid="clsid:8AD9C840-044E-11D1-B3E9-00805F499D93" WIDTH=150 HEIGHT=30
codebase="http://java.sun.com/products/plugin/autodl/jinstall-1_4_0-win.cab#Version=1,4,0,0">
<PARAM NAME=CODE VALUE="StartApplet.class">
<PARAM NAME="type" VALUE="application/x-java-applet;version=1.4">
<PARAM NAME="url" VALUE="<a href=http://www.yyy.zzz/app.jnlp>">

<COMMENT>
<EMBED invokeURLs=false type="application/x-java-applet;jpi-version=1.4" CODE="StartApplet.class" WIDTH=150 HEIGHT=30
pluginspage="http://java.sun.com/products/plugin/index.html#download">
scriptable=false
<NOEMBED invokeURLs=false></NOEMBED invokeURLs=false>
</EMBED invokeURLs=false>
</COMMENT>
</OBJECT>

</body>
</html>

================================================================================

StartApplet은 url 파라미터를 받아들이고 시작 버튼을 누르면 브라우져의 링크를 url 파라미터가
지정한데로 가도록 되어 있다. 따라서 실행하고 싶은 JNLP 파일을 url 파라미터에 주면 웹스타트가
자동 실행된다.

StartApplet.html은 StartApplet 클레스를 자바 플러그인 형태로 실행 시키며, 자바 플러그인이
설치되어있지 않으면 자동 설치되도록 되어 있다. 위의 예제에서 <a href=http://www.yyy.zzz/app.jnlp> 부분만
원하는 URL로 바꾸어주면 자동 설치 및 웹스타트 시작페이지로 사용할 수 있을 것이다.

이 방법은 간단한 Applet이 필요하기는 하지만 가장 간단하고 쉬운 방법이 아닌가 생각한다.

VII. Jar 자원 활용 및 JNLP API 이용
-----------------------------------

자바웹스타트 응용 시스템을 만들다 보면 때로는 클레스와 함께 jar 화일에 묶여있는
이미지 화일이나 프로퍼티 화일, 자료 화일 등이 필요할 경우가 있다. 이러한 경우에는
java.lang.ClassLoader로 부터 해당 자원을 얻을 수 있다.

다음은 jar화일에 묶여있는 이미지 화일로부터 Icon 객체를 생성해내는 예제이다.

ClassLoader loader = this.getClass().getClassLoader();
Icon myIcon = new ImageIcon(loader.getResource("imgs/myImg.gif"));

이 밖에도 java.lang.ClassLoader의 getResourceAsStream(String name) 메소드를 이용하면
다양한 자원을 InputStream을 통하여 얻을 수 있다.

자바웹스타트 환경에서는 Java 2 SE API에서 제공하지 않는 추가적인 기능을 하는 API를
제공한다. 이것은 JNLP API라고 한다. JNLP API를 이용하여 개발할 경우는 jnlp.jar가
필요한데 이러한 파일은 JNLP Developer's Pack에 포함 되어 있다. 다음은 Developer's Pack을
다운로드 할 수 있는 URL이다.

http://java.sun.com/products/javawebstart/download-jnlp.html

JNLP API가 추가적으로 제공하는 클레스는 javax.jnlp package로 묶여 있으며
BasicService, ClipboardService, DownloadService, FileOpenService, FileSaveService,
PrintService, PersistenceService 등이 있는데 이들은 ServiceManager 클레스를 통하여
사용할 수 있다. 각각의 기능은 다음과 같다.

- javax.jnlp.BasicService

BasicService는 웹스타트의 환경적인 면이나 브라우져를 통제하기 위한 API를 제공하는데
자바 애플릿의 경우 AppletContext와 비슷한 역할을 한다. 다음 예제는 웹스타트 환경에서
웹브라우져로하여금 특정 URL로 가도록 하는 것이다.

import javax.jnlp.*;
.....

BasicService bs = (BasicService)ServiceManager.lookup("javax.jnlp.BasicService");
bs.showDocument(new URL("http://www.javanuri.com"));

- javax.jnlp.ClipboardService

ClipboardService는 시스템에서 사용하는 클립보드에서 복사 객체를 가져오거나 클립보드로
복사하는 서비스를 제공한다. 자바웹스타트는 이 기능을 사용할 때 보안을 위하여 경고창을
보여준다. 다음은 간단한 스트링을 클립보드에 복사하는 예제이다.

import javax.jnlp.*;
.............

ClipboardService cs = (ClipboardService)ServiceManager.lookup("javax.jnlp.ClipboardService");
StringSelection ss = new StringSelection("Hello Web Start");
cs.setContents(ss);

- javax.jnlp.DownloadService

DownloadService는 자신의 자원을 Cache에 저장, 삭제등 Cache를 통제할 수 있는 서비스 API를
제공하는 클레스이다. 다음은 myapp.jar를 Cache에서 확인하고 있으면 삭제한후 다시 Cache에
저장하는 예제이다.

import javax.jnlp.*;
...........

DownloadServicd ds = (DownloadService)ServiceManager.lookup("javax.jnlp.DownloadService");
URL url = new URL("http://www.javanuri.com/jws/myapp.jar");
boolean isCached = ds.isResourceCached(url, "1.0");
if(isCached) {
ds.removeResource(url, "1.0");
}

DownloadServiceListener dsl = ds.getDefaultProgressWindow();
ds.loadResource(url, "1.0", dsl);

- javax.jnlp.FileOpenService

FileOpenService는 권한이 제약된 환경에서도 이를 사용자에게 알리고 화일을 열 수 있는
다이얼로그 윈도우를 열어주는 서비스이다. 다음 예제는 FileOpenService를 이용하여 화일을
여는 예제이다.

import javax.jnlp.*;
..............

FileOpenService fo = (FileOpenService)ServiceManager.lookup("javax.jnlp.FileOpenService");
FileContents fc = fo.openFileDialog(null, null);

- javax.jnlp.FileSaveService

FileSaveService는 권한이 제약된 환경에서도 local disk에 화일을 저장할 수 있는
기능을 제공하는 서비스 클레스이다. 이는 FileOpenService의 경우와 반대인 기능을 제공하는
클레스이다. 다음은 FileOpenService를 이용하여 화일을 연 후에 FileSaveService를
이용하여 화일을 저장하는 예제이다.

import javax.jnlp.*;
.....................

FileOpenService fo = (FileOpenService)ServiceManager.lookup("javax.jnlp.FileOpenService");
FileContents fc = fo.openFileDialog(null, null);
FileContents newfc = fss.saveFileDialog(null, null, fc.getInputStream(), "newfile.txt");

- javax.jnlp.PrintService

PrintService는 권한이 제약된 웹스타트 환경에서도 프린트를 가능하게 해주는 API 를 갖고 있는
서비스 클레스이다. 이 API를 이용하여 프린트를 요청하면 사용자에게 허가할 것인가를 묻는
다이얼로그가 나타난다. 다음은 PrintService를 이용한 프린트 요청 예제이다.

import javax.jnlp.*;
.....................

PrintService ps = (PrintService)ServiceManager.lookup("javax.jnlp.PrintService");

// default page format
PageFormat pf = ps.getDefaultPage();

// customizing page format
PageFormat npf = ps.showPageFormatDialog(pf);

// print
ps.print(new Doc());

// printable class
class Doc implements Printable {

....
public int print(Graphics g, PageFormat fm, int idx) {
....
}

}

}

- javax.jnlp.PersistenceService

PersistenceService는 브라우져의 쿠키와 마찬가지고 사용자의 클라이언트에 간단한 자료를
저장할 때 사용된다. 저장되는 형태는 url형태로 자장된다.
다음은 간단한 url을 저장하고 내용을 읽어들이는 예제이다.

import javax.jnlp.*;
.....................

PersistenceService ps = (PersistenceService)ServiceManager.lookup("javax.jnlp.PersistenceService");

String addr = "www.javanuri.com/some.txt";
java.net.URL = new URL(addr);

// create
ps.create(url, 1024);
FileContents fc = ps.get(url);
OutputStream os = fc.getOutputStream(false);
os.write(...);

// read
fc = ps.get(url);
InputStream in = fc.getInputStream();

in.read(...);

.......

- javax.jnlp.FileContents

FileContents 는 FileOpenService, FileSaveService, PersistenceService와 같은 서비스에서
input과 output을 처리할 수 있도록 만들어진 클레스이다. 일반적인 File 클레스와 비슷하게
생각하면 된다. 보안과 자료 저장 형태 등이 일반 File 클레스와는 다르다.

출처 : http://www.javanuri.co.kr, 유용근

이 글에는 트랙백을 보낼 수 없습니다

Application_developing 2007/10/16 10:57

[MFC] header & library 설명

다음글은 하이텔 MFC소모임의 시삽이신 권정혁님의 글을 정혁님의 허락하에
다음과 같이 기재합니다. 허락해주신 정혁님께 감사드립니다.
****************************************************************************

이 글은 Mike Blaszczak 의 책 "Professional MFC" 의 Appendix D 에 들어있는
내용을 번역한 글입니다. 제 허락없이는 다른 어떤게시판에도 옮기실수 없습니
다. 물론 글의 원 소유자는 Mike 입니다.

MFC&T 사용자 모임 시삽 권정혁

============================================================================
MFC 헤더와 라이브러리 설명 (The Foundation Classes Headers and Libraries) #2

2. 런타임 라이브러리 (Run-time Libraries)

이 테이블은 Visual C++ 과 함께 제공되는 라이브러리와 Pre-Compiled Object 들의
기능에 대한 짧은 설명을 보여줍니다.

------------------------------------------------------------------------------
런타임 라이브러리 (Run-time Libraries)
------------------------------------------------------------------------------

파일설명

Advapi32.lib 레지스트리나 보안 관련 API 같은 진보된 API 서비스 들에 대한
임포트 라이브러리 입니다. 이 임포트 라이브러리와 링크하는 것은
당신의 프로그램에서 Advapi32.dll에 포함된 함수들을 사용가능하게
합니다.

Atl.lib MS의 Active Template Library에 대한 라이브러리입니다.

Binmode.obj 이 모듈과 링크하면 C 런타임 라이브러리에 의해 오픈되는 화일들이
기본적으로 Binary 모드로 열리도록 만듭니다.

Cap.lib Call Attributed Profiler 에 대한 인터페이스 입니다.
이 툴은 함수 호출 패턴(Function Call Patterns)을 분석함으로써
Win32 어플리케이션을 튜닝하게 해줍니다.

Chkstk.obj 런타임용 Stak-Depth 체킹 모듈입니다.
이 모듈은 모든 함수 호출전에 스택의 크기(Depth)를 체크함으로써
Stack Overflow 가 났는지 체크하는 것을 도와줍니다.
Windows NT 에서는 프로그램의 스택 세그먼트를 조심스레 체크하며,
스택 오버플로우가 난다면 프로그램을 깨끗히 종료시켜주므로
이 파일은 거의 필요가 없습니다.

Comctl32.lib 윈도우 공용 컨트롤(Windows common controls)에 대한 라이브러리
입니다.

Comdlg32.lib 윈도우 공용 대화상자(Windows common dialogs)에 대한 라이브러리
입니다. 이 라이브러리는 표준 화일오픈,화일저장,폰트선택,프린트
,컬러 선택 다이얼로그에 대한 인터페이스를 제공합니다.

Commode.obj 전역적인 File Commit Flag 값을 Commit 으로 설정합니다.
이 파일과 링크하는것은 오픈되는 모든화일이 기본적으로 Commit
모드에서 열리도록 합니다.

Compmgr.lib 컴포넌트에대한 임포트 라이브러리입니다.

Ctl3d32.lib 3차원(Three-D) 컨트롤에대한 지원 라이브러리입니다.
이 라이브러리는 다이알로그와 컨트롤이 3차원으로 보이도록 합니다.
이 파일은 거의 안쓰입니다. 단지 이전버젼과의 호환을 위해
제공됩니다.

Ctl3d32s.lib Win32s 용 프로그램에대한 Ctl3d 라이브러리 입니다.

D3drm.lib Direct3D 렌더링 모델(Rendering Model) API에 대한
라이브러리입니다.

Daouuid.lib DAO 객체들에 대한 UUID 값을 가지고 있는 라이브러리입니다.

Ddraw.lib DirectDraw API용 라이브러리입니다.

Dflayout.lib Compound Document File(복합문서파일)에 대한 저장소 관리
(Storage Management)를 하는 OLE 함수들에 대한 임포트
라이브러리입니다.

Dlcapi.lib DLC 3270 연결을 위한 라이브러리 입니다.

Dplay.lib DirectPlay API용 라이브러리 입니다.

Dsound.lib DirectSound API용 라이브러리입니다.

Fp10.obj 이 모듈과 링크하는것은 프로그램이 기본적으로 53비트 대신 64비트
부동소수점(Floating-Point Precision) 형식으로 알고리즘을
사용하게 합니다.

Gdi32.lib Windows GDI 임포트 라이브러리 입니다. 이 라이브러리와 링크하는
것은 프로그램이 Windows 의 Graphic Device Interface 에 있는
SelectObject(),CreateFont(),LineTo() 와 같은 루틴을 호출하여
디스플레이나 프린터에 그리기를 수행할수 있도록 합니다.

Glaux.lib OpenGL 보조 함수 라이브러리 입니다. 거의 모든 프로그램에서
사용되지 않으나, 이것은 OpenGL의 핵심(Core) 라이브러리의 기능을
향상시키도록 합니다. Opengl32.lib도 참조하십시오.

Glu32.lib OpenGL 그래픽의 핵심 함수 라이브러리 입니다.
Opengl32.lib도 참조하십시오.

Hlink.lib IHlink 와 관련 인터페이스 지원용 라이브러리 입니다.
이 인터페이스들은 ActiveX 객체가 일반적인 하이퍼링크 스타일
(Hyperlink-Style)의 이동을 구현하는것을 도와줍니다.

Imagehlp.lib 디버거 같은 시스템 툴들이 다른 프로세스를 관리하고 디버그
정보를 추출하도록 하는 루틴이 들어있는 라이브러리 입니다.

Imm32.lib Input Method Editor(IME)의 사용에 라이브러리입니다.
IME는 조그만 팝업윈도우창으로 보이며,다른나라 글자를 선택하도록
해줍니다. (한글윈도우 사용자는 다들 아시죠..? )

Kernel32.lib Windows Kernel 의 임포트 라이브러리입니다. 이 라이브러리와
링크함으로써 Windows kernel 안에 들어있는 루틴의 호출이 가능해
집니다. Windows Kernel 에는 CreateSemaphore() 나 GlobalAlloc()
같은 함수들이 포함되어 있습니다.

Largeint.lib 수학계산용 Large Interger지원 라이브러리입니다. 이 라이브러리는
단지 호환목적으로 제공됩니다. Visual C++ 의 컴파일러는
기본적으로 64Bit Integer 를 지원합니다.

Libc.lib 표준 C Runtime 라이브러리입니다. 이 라이브러리는 sprintf() 나
strcpy() 같은 함수들이 포함되며,프로그램에 정적으로 링크됩니다.
이것은 멀티쓰레드나 재진입(re-entrant) 프로그램에는 안전하지
못합니다.

Libcd.lib 표준 C Runtime 라이브러리의 Debug 버젼입니다. Debug 빌드에선
Libc.lib 대신 이것을 사용합니다. 이것은 프로그램에 정적으로
링크됩니다.

Libci.lib 표준 C 라이브러리입니다.이 라이브러리는 Libc.lib 와 비슷하지만,
이것은 이전 버젼의 Visual C++ 에서 제공된 표준 라이브러리에서
지원된 표준 iostream 라이브러리와의 호환성을 제공합니다.
(iostream이 구형이라는 얘기죠..) 이것은 프로그램에 정적으로
링크됩니다.

Libcid.lib 구형 iostream 버젼이 포함된 표준 C 라이브러리의 Debug버젼입니다.
역시 프로그램에 정적으로 링크됩니다.

Libcimt.lib 구형 iostream 버젼이 포함된 표준 C 라이브러리의 멀티쓰레드에
안전한 버젼입니다. 역시 프로그램에 정적으로 링크됩니다.

Libcimtd.lib 구형 iostream 버젼이 포함된 표준 C 라이브러리의 멀티쓰레드버젼의
Debug 빌드입니다. 역시 프로그램에 정적으로 링크됩니다.

Libcmt.lib 멀티쓰레드 프로그램이나 재진입 프로그램에서도 사용이 안전한
sprintf() 나 strcpy() 같은 함수들이 포함된 C 런타임 라이브러리의
멀티쓰레드 버젼입니다. 역시 프로그램에 정적으로 링크됩니다.

Libcmtd.lib 바로위의 Libcmt.lib의 디버그 버젼입니다.
역시 프로그램에 정적으로 링크됩니다.

Libcp.lib 표준 C++ 런타임 라이브러리 입니다. 이것은 호출하는 프로그램에
정적으로 링크되며, 멀티쓰레드나 재진입 프로그램에 안전하지
못합니다.

Libcpd.lib 표준 C++ 런타임 라이브러리의 Debug 버젼입니다.
역시 프로그램에 정적으로 링크됩니다.

Limcpmt.lib 표준 C++ 런타임 라이브러리의 멀티쓰레드 버젼입니다.
역시 프로그램에 정적으로 링크됩니다.

Libcpmtd.lib 멀티쓰레드 C++ 표준 라이브러리의 디버그 버젼입니다.
역시 프로그램에 정적으로 링크됩니다.

Loadperf.lib 이 임포트 라이브러리는 Performance Counter 에 관련된 레지스트리
값들의 초기화를 지원하는 루틴들을 포함합니다.
이 라이브러리는 보통 인스톨 프로그램들에 사용됩니다.

Lz32.lib Lempel-Ziv압축해제 루틴 라이브러리입니다. 보통 인스톨 프로그램에
의해서 사용됩니다. 이 라이브러리는 압축루틴이 없으며,
단지 압축해제루틴만 들어있습니다.

Mapi32.lib Microsoft Mail API 라이브러리 입니다..

Mfcapwz.lib Custom Wizard의 개발을 가능하게 해주는 클래스와 함수들이 포함된
라이브러리입니다.

Mfcclwz.lib Custom Wizard의 개발을 가능하게 해주는 클래스와 함수들이 포함된
라이브러리입니다.

Mfcuia32.lib OLE 공통 사용자 인터페이스(Common User Interface)에 대한 MFC 의
구현부분이 들어있습니다. Oledlg.lib 와 비슷하지만 Unicode 대신
ANSI 인터페이스를 지원합니다.

Mgmtapi.lib SNMP(Simple Network Management Protocol) Management API입니다.

Mpr.lib 연결관리(Connection Management)를 위한 LAN Manager 스타일의
네트웍 API 가 들어있습니다. 이 API 들은 Windows 상에서 Connect와
DIsconnet 를 가능하게 합니다.

Msacm32.lib Microsoft Audio Compression Manager API(오디오 압축 관리자 API)
입니다. 이것은 Audio Waveform 데이타를 압축하고 해제하는
유틸리티들입니다.

Msconf.lib Microsoft ActiveX Conferencing API 에 대한 라이브러리입니다.

Mslsp32.lib License Service API 에 대한 임포트 라이브러리입니다.

Msvcirt.lib 구형 iostream 버젼이 포함된 표준 C 라이브러리의 DLL Build에 대한
임포트 라이브러리 입니다. 이 라이브러리는 Libci.lib 라이브러리의
DLL 버젼에 대한 임포트 라이브러리입니다.

Msvcirtd.lib Mscvirt.lib 라이브러리의 디버그 버젼에 대한 임포트 라이브러리
입니다.

Msvcprt.lib 표준 C++ 라이브러리의 DLL Build 에 대한 임포트 라이브러리입니다.
이것은 Libcp.lib 라이브러리의 DLL 버젼에 대한 임포트 라이브러리
입니다.

Msvcprtd.lib Msvcprt.lib 라이브러리의 디버그 버젼에 대한 임포트 라이브러리
입니다.

Msvcrt.lib 표준 C 라이브러리의 DLL Build 에 대한 임포트 라이브러리입니다.
이것은 Libc.lib 라이브러리의 DLL 버젼에 대한 임포트 라이브러리
입니다.

Msvcrtd.lib Msvcrt.lib 라이브러리의 디버그 버젼에 대한 임포트 라이브러리
입니다. Retail Build 에서 Msvcrt.lib 를 사용한다면 Debug Build
에서 이것을 사용하십시오.

Mswsock.lib Windows Sockets 2 API 에 대한 Microsoft 의 특정 확장부분
(MS-Specific Extension)에 대한 임포트 라이브러리입니다.

Mtx.lib Microsoft Transaction Server(MTS) 프로그래밍 인터페이스
라이브러리입니다.

Mtxguid.lib MTS 에 의해 지원되는 객체들의 Guid 들을 가지고 있는
라이브러리입니다.

Nddeapi.lib Network DDE API 입니다. DDE 스타일의 서비스를 네트웍을 통하여
시스템간에 사용가능하도록 해줍니다.

Netapi32.lib LAN Manager API Interface 입니다. 이 라이브러리는 MS 의 NOS
(Network Operation System)에 의해 제공되는 저수준의 기능들을
사용하도록 해주는 함수들을 포함하고 있습니다.

Newmode.obj 당신의 Application 이 malloc() 함수 호출에 실패했을때 new
연산자의 에러 처리 메커니즘을 사용하도록 하여줍니다. 기본적으로,
이것이 동작하지는 않습니다: malloc() 이 실패한다면, NULL 을
리턴하지 예외를 발생시키지(Throw) 않습니다.
이 모듈과 링크하는것은 malloc() 호출 실패시 new 연산자의 에러
처리루틴을 호출하는 것으로 C 런타임 라이브러리의 동작을
변경합니다.

Ocx96.lib OCX 96 스펙(Specification)에 명시된 인터페이스들에 대한
UUID들이 포함된 라이브러리입니다.

Odbc32.lib ODBC API 라이브러리입니다. 이 라이브러리는 데이터베이스
응용프로그램에 대한 하부 독립적인 API 들을 제공합니다.
이 라이브러리는 MFC 에 의해 다시 추상화 됩니다.

Odbccp32.lib ODBC control panel applet(제어판에 등록되는 응용프로그램) 에
관한 인터페이스가 포함된 라이브러리입니다.

Oldnames.lib "Kernighan and Ritchie C" 와 호환되는 이름을 가진
표준 C 런타임 라이브러리 함수들입니다. 이 라이브러리는 K&R-C의
execv() 같은것을 ANSI-C 의 같은 함수인 _execv() 에 매핑(Mapping)
해줍니다.

Ole32.lib 32-bit OLE 지원을 위한 Core 라이브러리입니다.

Oleaut32.lib 32-bit Automation interface에 대한 라이브러리입니다.

Oledlg.lib OLE 공통 사용자 인터페이스(Common User Interface)에 대한 System
Implementation입니다. OleUiEditLinks() 나 OleInsertObject()와
같은 함수들을 포함합니다.

Olepro32.lib OLE Property Frame API 입니다. 또한 OLE Font (IFont) 나 Picture
(IPicture) 프로퍼티 들에 대한 구현도 포함하고 있습니다.

Opengl32.lib OpenGL 의 Core 함수 라이브러리입니다.
OpenGL 은 Silicon Graphics에 의해 정의되고 Microsoft 에 의해
Win32 용으로 구현된 Graphic Rendering Language 입니다.
Glu32.lib 와 Glaux.lib 도 참조하십시오.

Pdh.lib Performance Data 헬퍼함수들에 대한 임포트 라이브러리 입니다.
이 Win32 API의 부분들은 프로세스에대해 Performance Counter 들을
작성하고,질의하고,갱신하는것을(Create,Query,Update) 도와주는
쉬운 인터페이스들을 포함하고 있습니다.

Penwin32.lib Pen Computing 용 Windows에 대한 확장 라이브러리입니다.

Pkpd32.lib Pen Windows 의 커널함수들입니다.

Rasapi32.lib 클라이언트용 Remote Access Service(RAS) API 입니다.
이 라이브러리에 있는 함수들은 모뎀이나 그 비슷한 저속회선 연결을
통해 원격지 컴퓨터에 연결하도록 해줍니다.

Rasdlg.lib RAS 응용프로그램에 대한 Common User Interface 요소들을 포함하고
있는 라이브러리입니다.

Rassapi.lib RAS 서버쪽 API 들입니다.

Rpcndr.lib Remote Procedure Call(RPC) Helper Function API 들입니다.

Rpcns4.lib RPC Name Service 함수들 입니다.

Rpcrt4.lib RPC Windows run-time 함수들 입니다.

Scrnsave.lib 화면보호기(Screen saver) 에 대한 인터페이스입니다.

Setargv.obj 이 모듈과 링크하는 것은 콘솔 프로그램이 Wildcard(*,?) 를 사용한
Command Line Parameter들을 실제 파일이름들로 지정하도록 확장하여
줍니다. 각 파일은 main() 의 argv 인자에 들어가게 됩니다.
윈도우에서 사용하려면 Wsetargv.obj 를 살펴보십시오.

Setupapi.lib 파일 설치와 삭제에 관련된 함수들입니다. 셋업프로그램에서
사용됩니다.

Shell32.lib Windows Interface Shell API 들입니다. 이 API 들은 예를 들어
Norton Desktop for Windows와 같은 프로그램에 사용된 실행파일에서
아이콘을 추출하거나 Command Line Parameter 를 사용해 다른
프로그램을 실행하는등의 기능을 제공합니다.

Snmpapi.lib Simple Network Managerment Protocol(SNMP)에 관련된 주 API 함수들
입니다. TCP/IP 네트웍에 대해 이 프로토콜은 Gateway 나 연결될
네트웍들을 모니터링하는데 사용됩니다. Mgmtapi.lib 와 연관되어
있습니다.

Svrapi.lib Inter-Server Communication 에 관한 Network API 들입니다.

Tapi32.lib Microsoft Telephony API(TAPI) 라이브러리입니다. lineOpen() 과
같은 telephony API 들을 구현합니다.

Th32.lib 32-bit Toolhelp 라이브러리입니다. 이 라이브러리는 Debugger 나
저수준의 툴을 작성할때 도움을 주는 함수들을 제공합니다.
예를들어 이 라이브러리의 루틴들을 윈도우상에서 프로세스나
쓰레드들을 Enumerate 하게 해줍니다.

Thunk32.lib Thunk 컴파일러의 런타임 지원 루틴들입니다.
(Thunking 이 뭔지 아시죠?)

Url.lib 이 파일은 URL 을 Parsing 하거나 MIME 헤더를 해석하는데 사용하는
루틴입니다. 이 라이브러리에 있는 함수들은 현재 문서화가 되어있지
않으며, 이것은 Win32 SDK 의 이후 버젼이나 Visual C++ 의 이후버젼
에서 더 정제되고, 무리없게 지원될것입니다.

Urlmon.lib URL 모니커(moniker)의 런타임 지원 라이브러리에 대한 임포트
라이브러리입니다.

User32.lib 윈도우즈의 USER.EXE 에 대한 임포트 라이브러리 입니다.
이 라이브러리와 링크하는 것은 프로그램이 Windows 의 UI 부분을
사용할수 있도록 해줍니다. 이 라이브러리는 CreateDialog() 나
CreateWindow() 같은 함수를 포함합니다.

Uuid.lib Stock(내장된) OLE 객체에 대한 표준 UUID 들입니다.

Vdmdbg.lib 이 라이브러리에 있는 함수들은 NT VDM 안에서 디버깅 하는것에
관련된 기능들을 지원합니다.

Version.lib GetFileVersion() 과 같은 버젼 확인 API 들입니다.

Vfw32.lib Video for Windows API 들입니다. 이 라이브러리에 있는 함수들은
Multimedia 비디오와 오디오를 재생,녹음,수정,저장 하는것을
가능하게 합니다.

Webpost.lib WebPost API 임포트 라이브러리입니다. 이 라이브러리는 ISP
(internet Service Provider)에 의해 제공되는 웹 사이트에 사용자의
컴퓨터에서 데이타를 올리는것이 가능하도록 도와줍니다.

Win32spl.lib Win32 spooler API 입니다. 이 파일에 있는 루틴들은 다른 프로그램
이나 컴퓨터들로부터 Print Spooler Status에 접근하는것을 가능하게
해줍니다.

Wininet.lib Windows Internet Client API 들입니다.

Winmm.lib Windows Multimedia API 들입니다. Multimedia Device Management,
Timer, Wave 파일, Multimedia I/O 제어함수 같은 것을 포함합니다.

Winspool.lib The Win32 spooler API 입니다. 이 라이브러에 보이는 루틴들을
프로그램이 프린트 하는 동안 Windows Print Spooler 의 기능을
사용하도록 해줍니다.

Winstrm.lib Windows NT의 TCP/IP interface들 입니다. 이 파일은 TCP/IP Routing
함수같은 것들을 제공합니다.

Wintrust.lib ActiveX 객체에 대한 Trust Verification(신용확인) 에 관한 API들의
임포트 라이브러리입니다. 이것은 WinVerifyTrust() 같은 함수들을
사용가능하도록 합니다.

Wow32.lib 이 라이브러리는 16-Bit 와 32-Bit 객체간에 핸들을 변형하도록 하는
Generic Thunking 메커니즘에 의해 사용됩니다. 이 라이브러리는
또한 16 Bit 프로세스에서 32-Bit 메모리를 관리하도록 하는것을
도와줍니다.

Ws2_32.lib Windows Sockets 2.0 API 입니다.

Wsetargv.obj 이 모듈과 링크하는 것은 Windows 프로그램이 Wildcard(*,?) 를
사용한 Command Line Parameter 들을 실제 파일이름들로 지정하도록
확장하여 줍니다. 각 파일은 main()의 argv 인자에 들어가게 됩니다.
콘솔 프로그램에 사용하려면 setargv.obj 를 살펴보십시오.

Wsock32.lib Windows Sockets APIs.

Wsock32.lib Windows Sockets API 입니다.

Wst.lib Working Set Tuner DLL 에 대한 인터페이스 입니다.
Working Set Tuner DLL 은 프로그램을 조사하여 프로그램의
Working Set 을 최소화하도록 도와줍니다.

=============================================================================

참고할점은 주 API 들과, 헤더 파일 그리고 라이브러리들의 요약이 Lib 디렉토리에
있는 Win32api.csv 에 들어있다는 것입니다. 이 파일은 Comma Separated Variable
File(*.csv) 이며, 엑셀같은 스프레드쉬트 프로그램에서 읽어들일수 있습니다.

이 글에는 트랙백을 보낼 수 없습니다

Application_developing 2007/10/16 10:51

RPC(Remote Procedure Call)

<<< RPC(Remote Procedure Call)의 역사 >>>

1980년대에 여러 회사들에 의하여 다양한 RPC가 구현되었다. 그들 중에는 SUN 마이크로 시스템즈의 오픈 네트워크 컴퓨팅(ONC)과 Apollo 컴퓨터의 네트워크 컴퓨팅 구조(NCA)가 있었다. 오늘날, 휴렛 팩커드의 HP-UX, IBM의 AIX, SUN 마이크로 시스템즈의 SUN OS 4.1.X, 그리고 산타크루즈 오퍼레이션의 SCO UNIX와
같은 대부분의 상업용 UNIX 시스템들은 모두 ONC 기법을 기반으로 RPC를 구현 하였다.

그러나, SUN의 Solaris 2.x 운영체제와 UNIX 시스템 V.4는 수정된 ONC 기법을 기반으로 RPC를 구현하였다. 두 기법은 매우 비슷하다. 즉, 그들은 모두 네트워크를 통하여 데이터를 전송하기 위한 외부 데이터 표현(XDR) 형식을 사용하고, RPC 응용들의 생성을 간단히 하기 위하여 rpcgen 컴파일러를 제공한다. 그러나 두 기법은 ONC 기반 RPC API들이 소켓에 기초를 하고, 반면에 시스템 V.4 RPC API들은 소켓이나 TLI를 기초로 할 수 있다는 점에서 차이가 있다.

<<< RPC 프로그래밍 인터페이스 계층 >>>

RPC 프로그래밍 인터페이스는 다양한 계층이 있다. 그들의 범위는 C 라이브러리 함수들(예를 들면, printf)을 호출하는 것과 같은 방법으로 사용자들이 시스템이 제공하는 RPC 함수들을 호출하는 최상위 계층으로부터, RPC API들을 사용하여 사용자들이 RPC 프로그램을 생성하는 하위 계층까지 있다. 이 다양한 프로그래밍 인터페이스 계층들은 이 장의 나머지 부분에서 자세히 설명한다.

최상위 계층에는 원격 시스템의 정보를 수집하기 위하여 사용자들이 직접호출할 수 있는 시스템이 제공하는 RPC 함수들이 있다. 이 함수들은 일반적으로 C 라이브러리 함수처럼 사용될 수 있다. 단지 그들을 사용하기 위해서는 특별한 설정이 필요하다. 즉, (1) 함수의 원형들을 선언하는 특정한 헤더 파일들과 (2) 컴파일된 프로그램은 -lrpcsvc 스위치와 함께 링크되어야 한다. librpcsvc.a 라이브러리는 이 RPC
라이브러리 함수의 목적코드들을 포함한다.

RPC 라이브러리 함수들의 장점은 쉽게 사용할 수 있고 프로그래밍 부담이 적다는 것이다. 그러나, 시스템에서 정의된 RPC 라이브러리 함수들은 많지 않다. 그러므로, 이 함수들에 대한 응용은 제한이 있다.
RPC 프로그래밍의 두 번째 계층은 RPC 클라이언트와 서버의 스터브(stub) 루틴을 자동적으로 생성하기 위하여 rpcgen 컴파일러를 사용하는 것이다. 사용자들은 단지 클라이언트와 서버 프로그램을 생성하기 위한 클라이언트의 main 함수와 서버의 RPC 함수들만 작성한다. 또한 rpcgen 컴파일러는 클라이언트와 서버 사이에 데이터를 전송하기 위하여사용자가 정의한 어떠한 데이터 유형이든지 XDR 형식으로 변환하는 XDR 함수들을 생성할 수 있다. rpcgen 컴파일러를 사용하여 얻는 장점은 사용자들이 RPC 함수들과 클라이언트의 main함수를 작성하는데 주력할 수 있다는 것이다. 즉, 하위 계층의 RPC API들에 대해 알 필요가 없다. 이는 프로그래밍 노력을 절약하고 오류 발생 가능성을 줄이게 한다. 그러나, 이러한 접근 방식의 결점은 rpcgen이 생성한 클라이언트 서버프로그램에 의하여 사용되는 네트워크 트랜스포트의 어떤 자세한 속성들에 대하여 제어하기 어렵다는 것이다. 또한 이러한 클라이언트 서버는 XDR 함수들에 의하여 사용되는 동적 메모리를 관리할 수 없다.
RPC 프로그래밍 인터페이스의 최하위 계층은 RPC 클라이언트와 서버 프로그램들을 생성하기 위하여 RPC API들을 생성하는 것이다. 이것의 장점은 사용자들이 프로세스에 의하여 사용되는 네트워크 트랜스포트와 XDR 함수에 있는 동적 메모리 관리를 직접 제어할 수 있다는 것이다. 그러나, 이는 사용자 부분에서의 더 많은 프로그래밍 노력이 필요하게 된다

출처 : Tong - forestcamp님의 [솔라리스]통

이 글에는 트랙백을 보낼 수 없습니다

Application_developing/C 2007/08/09 01:20

초보자들이 반드시 지켜야 할 C언어 코딩 기법 c/c++

먼저 이 글을 읽고 숙지하기에 앞서 이 글은
순전히 C언어 프로그래밍 초보자들을 위한 코딩 기법을 설명한 것임을
알려드립니다. 자신이 초보자가 아니면 그냥 넘어가시면 되겠습니다.
그러나 제가 만나 본 대부분의 프로그래머들(중수 이상 역시 포함)은
이런 규칙을 전혀 지키지 않고 코딩하더군요..
제가 쓴 글을 읽고 코딩한다면 누가 보더라도 정갈하고 깔끔한
코드가 될 것입니다. 나중에 스스로 짠 코드가 뭔 소린지 몰라
한참을 헤메지 않도록 열심히 노력합시다.
그럼 본론으로 들어갑니다..

1. 변수의 이름

보통의 경우 변수의 이름은 아무 뜻도 없거나 무진장 줄여 자신이 짠
코드도 나중에 보면 무슨 소린지 무슨 내용인지 전혀 알아 보지 못하는
경우가 있습니다. 이럴 때 변수의 이름만이라도 변수의 용도에 맞춰서
쓴다면 그 얼마나 좋겠습니까..

변수의 이름을 정할 때는 반드시 헝가리안 표기법을 사용합니다.

코드:

int nCounter=0;

이런식으로 변수의 이름 앞에 변수의 종류를 쓰는 것입니다.

보통 숫자는 number의 앞 글자인 n, boolean 변수는 b, 포인터 변수는 ptr,
문자형 변수는 c, 문자열 변수는 str 등 변수의 이름 앞에 특징을 기술하는 것입니다.

아래의 예를 보시기 바랍니다.

코드:

nSum=0;
nGrandTotal=0;
cAlphabet=NULL;
strSum[6]={0,};
ptrNextChain=NULL;

sum=0;
grandTotal=0;
alphabet=NULL;
sum[6]={0,};
nextChain=NULL;

위와 아래는 보기에도 차이가 납니다.
특히 alphabet 변수와 nextChain 변수는 아래의 예만 봤을 때 무슨
변수인지 전혀 알아 볼 수 없습니다. 그러나 위의 예를 보게 되면
최소한 알파벳 변수가 문자형 변수이고 넥스트체인 변수가 포인터라는
것은 변수의 이름을 통해 유추할 수 있습니다.

2. 변수의 초기화
변수의 초기화는 대단히 중요하면서도 간과하기 쉬운 예입니다.
변수는 로컬 변수, 글로벌 변수를 불문하고 무조건 변수 선언할 때
초기화를 해줍니다. 보통의 C 컴파일러는 변수를 초기화하지 않고
그냥 사용하는 경우(즉, 초기화 없이 변수의 값을 읽는 경우) 워닝
메시지를 뿌리거나 메시지 없이 계속 컴파일합니다. 이것으로 생기는
side-effect(부작용)은 컴파일러도 책임지지 않습니다.
프로그램 코드가 길어지면 길어질수록 이런 변수 초기화를 하지 않아
생기는 문제는 더더욱 찾기 어려워집니다.

3. 변수의 용도 또는 사용 목적 기술
보통 위의 방법을 지켜 선언하더라도 변수를 선언할 때는 아래와 같이 하는
프로그래머들이 종종 있습니다.

코드:

int nCounter=0, nSum=0, nMaxBound=0;

나중에 코드를 읽는데 보기에 대단히 좋지 않은 방법입니다.
헝가리안 표기법으로 변수의 이름을 설정했지만 사실상 대략적인
의미만 파악할 수 있을 뿐 그 정확한 용도를 알아보기 힘듭니다.
따라서 아래와 같이 코딩합니다.

코드:

int nCounter=0; /* for문의 카운터로 사용하기 위해 선언 */
int nSum=0; /* 성적의 합을 저장하기 위해 선언*/
int nMaxBound=0; /* 그래프의 최대 한계값을 저장하기 위해 선언 */

자, 어떻습니까? 변수를 여러줄에 나눠서 선언한다고 해서 컴파일
시간이 비약적으로 늘어나는 것도 아니고 주석을 많이 단다고 해서
지저분한 프로그램 코드가 되는 것도 아닙니다. 위와 같이 코드를
짜놓으니 변수만 보고서도 무엇하는 프로그램인지 대충의 감을 잡게 됩니다.
프로그램 코드의 분석은 변수의 용도를 파악하기만 해도 절반은 된 것입니다.

잔소리. 저는 최대한 표준 C언어에 가깝게 코딩합니다. 보통의
컴파일러는 표준을 어기더라도 약간 느슨하게 허용하는 경우가 있으나
컴파일러간의 이식률을 높이기 위해 반드시 표준을 지켜서 코딩합시다.

4. 줄 맞추기
이건 잔소리 안할래야 안할 수 없습니다. 제가 대학시절 수업을
들을 당시 C언어를 강의한다는 강사라는 사람도 줄맞추기를
대충하더군요.. 줄맞추기는 칼 같이 해야합니다. 이건 권고사항이
아니라 반드시 지켜야할 강제사항입니다. 줄 맞추기 위해서
스페이스를 두 번 뚜드리던 탭을 한번 하던 그것은 여러분들의
자유지만 어떤 방법을 쓰더라도 줄 맞추기는 통일성 있게
칼 같이 해줘야 합니다. 어떨 때는 스페이스 2번으로 했다가
어떨 때는 탭으로 했다가 하는 짓은 줄 맞추기를 아니한만 못합니다.
코드를 분석해야하는 사람의 입장에서는 굉장히 괴로운 일입니다.
결국에는 눈물을 머금고 코드 줄 맞추기를 한 뒤에 그제서야
코드를 읽기 시작합니다. 무심코 빼먹은 줄 맞추기가 여러 사람 괴롭힙니다.

5. 중괄호 쓰기
코드를 아름답게 하는 방법중에는 중괄호를 많이 쓰는 방법도 있습니다.
중괄호를 많이 쓰면 그만큼 코드를 읽기 편해집니다. 영역이 명시적으로
표시되기 때문이지요.. 예를 볼까요?

코드:

while(i>0) if(i>0) printf("안녕?"); else i--;

while(nCounter>0)
{
if(nCounter>0)
{
printf("안녕?");
}
else
{
nCounter--;
}
}

극단적인 예이기는 하지만 저런 식으로 코딩하는 양반들도
종종 있습니다. 프로그래머라고 할 수도 없는 사람들이지만 말입니다.
아래의 예를 보면 어디서부터 어디까지 무슨 기능을 하는지 명시적으로
표현되어 있습니다. 단 한줄의 코드라도 for, do ~ while, switch ~ case, if ~ else 등
중괄호를 사용할 수 있으면 하라는 것입니다.
사실 중괄호는 아무데나 사용할 수 있습니다만 보통은 변수의 범위를
적절하게 설정하기 위해 코드 중간에 다음과 같이 사용하기도 합니다.

코드:

...
{
int nCounter=0;
for(nCounter=10; nCounter > 0; nCounter--)
{
...
}
}
...

6. Comment(주석)의 사용
누가 Comment를 주석이라고 이름 붙였는지 모르지만 주석이라는
단어의 사전적인 의미를 모르겠군요.. 그냥 주석은 이런저런 것이다는
것을 경험에 비추어 알고 있지만 말입니다. (일본식 단어인 것 같은
느낌을 팍팍 주는군요..) 저는 코멘트라고 하겠습니다.

용어야 어쨌거나 말았거나 코멘트는 많으면 많을 수록 좋다는 것이
제 지론입니다. 코멘트가 많으면 일단 프로그램을 읽는데 훨씬 수월해집니다.
긴 말 필요 없이 예를 보도록 하겠습니다. 아주 나쁜 예입니다.

코드:

int main(void)
{
int n,m,i,j,x,y,z,dae,so;

while(1)
{
printf("두수를 입력하시오 : ");
scanf("%d,%d",&n,&m);

if(n==0 || m==0)
{
break;
}

x=n;
y=m;

for(;;)
{
z=x%y;
if(z==0)
{
break;
}
else
{
x=y,y=z;
}
}
so=y;

printf("%d\t,",so);

dae=n*m/so;

printf("%d\n",dae);
}
return 0;
}

아.. 줄 맞춤은 제대로 되었지만 언뜻 봐서는 이거 뭐하는 프로그램인지
도무지 알 수가 없습니다.. 5초 안에 알아보실 수 있겠습니까?
변수 이름도 엉망진창이고 주석도 하나 없고 도무지 알아 먹을 수 있는
코드가 아닙니다. 만약 이 코드를 5초안에 알아 볼 수 있다면
이 프로그램을 1시간 전에 짰거나 이 알고리즘을 자다 일어나서도 외울 수
있을 정도인 사람이거나 당신은 정말 코드 분석의 천재입니다.
훌륭한 예를 보겠습니다.

코드:

/*************************************************************
Promgrammed by Chronoshift at 2003.10.12
If you want to contact me, send email to taktaktak@aheheh.com

아규먼트
없음

부작용
없음

프로그램의 내용
두 수를 입력 받고 입력 받은 두 수를 가지고
먼저 최대공약수를 계산하고 최대공약수와
입력받은 두 수를 가지고 최소공배수를 계산한다.
*************************************************************/

int main(void)
{
int nInputNumber1 = 0; /* 입력 받은 숫자를 저장하는 변수 */
int nInputNumber2 = 0; /* 입력 받은 숫자를 저장하는 변수 */
int nNumberBuffer1 = 0; /* 공약수 계산을 위한 임시 저장소 */
int nNumberBuffer2 = 0; /* 공약수 계산을 위한 임시 저장소 */
int nBalance = 0; /* 나머지 값을 저장 */
int nGCM = 0; /* 최대공약수 */
int nLCM = 0; /* 최소공배수 */

while(1)
{
/* 수를 입력 받는 부분 */
printf("두 수를 입력하시오 <예) 43 22> : ");
scanf("%d,%d",&nInputNumber1,&nInputNumber2);

/* 입력이 잘 못 된 경우 */
if(nInputNumber1==0 || nInputNumber2==0)
{
printf("This program not support 0 to input.\r\nPlease input other number again.");
break;
}

/* 입력 받은 수를 최대공약수 계산을 위해 임시 저장 */
nNumberBuffer1 = nInputNumber1;
nNumberBuffer2 = nInputNumber2;

/* 최대 공약수 계산 부분 */
/* 최대 공약수를 구하는 알고리즘은(X/Y)%Z == 0 일때까지 무한 반복 */
/* 만약 X=69 이고 Y=93이면 나머지 69가 된다 */
/* X Y Z */
/* 69 93 69 */
/* ↙ ↙ */
/* 93 69 24 */
/* ↙ ↙ */
/* 69 24 21 */
/* ↙ ↙ */
/* 24 21 3 */
/* ↙ ↙ */
/* 21 3 0 */
/* ↙ */
/* 최대공약수 : 3 */

for(;;)
{
/* 두 수를 나눈 나머지 계산*/
nBalance = nNumberBuffer1 % nNumberBuffer2;

/* 나머지가 0이면 GCM을 찾은 것이다 */
if(nBalance == 0)
{
break;
}
/* 그렇지 않으면 알고리즘에 의해 Y를 X에 넣고 */
/* 나머지를 Y에 넣고 재계산한다 */
else
{
nNumberBuffer1 = nNumberBuffer2;
nNumberBuffer2 = nBalance;
}
}
nGCM = nNumberBuffer2;

/* 최소공배수는 두 수를 곱하고 최대공약수로 나누면 된다 */
nLCM = nInputNumber1 * nInputNumber2 / nGCM;

printf("%d\t%d\r\n", nGCM, nLCM);
}
return 0;
}

얼마나 아름다운 코드입니까.. 코드를 모두 제거하고
주석만 남기더라도 뭐하는 프로그램인지, 어떻게 돌아가는지,
어떻게 코드를 짜야하는지 금방 알 수 있습니다.
이것이 바로 아름다운 1줄의 예술이라는 것입니다.
7. 맺음말..
더 이상 할 말이 없습니다. 앞에서 얘기한 6가지 원칙을 지키면
누구라도 한 줄의 예술을 할 수 있는 것입니다. 거만하게도 여러분들이
저를 보시기에 제가 고수라고 느끼실런지 몰라도 저 역시 하수일 뿐입니다.
진정한 고수라고 생각되는 사람이 저희 회사에 한 명이 있는데
그 양반 코드는 아주 소설입니다. 프로그래밍을 할 때 아주 소설책을 씁니다.
줄줄이 소설을 쓴 뒤에는 그 소설을 코멘트로 해서 코드를 써갑니다.
코드를 작성한다기 보다 쓴다고 해야 옳은 표현일겁니다.
물론 프로그래밍 마인드가 있어야 가능한 일이기는 합니다.
어쨌거나 코멘트를 바탕으로 코드를 작성해야 하는 원칙은 어느 언어든지
틀린 이야기가 아닙니다. 어렸을 때 전산학원을 다닌 사람들은 알 겁니다.
프로그램 언어를 배우기 전에 순서도를 배우죠. 그리고 순서도를 먼저
그리고 그것으로 프로그래밍하라고 배우죠. 사실은 순서도를 코멘트로
변환하고 그것으로 코딩해야 하지만 말입니다..
에이 모르겠다.. 역시 저는 글쟁이는 안되나 봅니다.
찝찝하더라도 여기서 끝입니다. -끝-

이 글에는 트랙백을 보낼 수 없습니다

Application_developing/C 2007/08/03 13:58

알고리즘

알고리즘·자료구조 학습에서의 문제
우리는 알고리즘 카탈로그를 배웁니다. 이미 그러한 해법이 존재하고, 그것이 최고이며, 따라서 그것을 달달 외우고 이해해야 합니다. 좀 똑똑한 친구들은 종종 "이야 이거 정말 기가 막힌 해법이군!"하고 감탄할지도 모릅니다. 대부분의 나머지 학생들은 그 해법을 이해하려고 머리를 쥐어짜고 한참을 씨름한 후에야 어렴풋이 왜 이 해법이 그 문제를 해결하는지 납득하게 됩니다.

그리고는 그 '증명'은 책 속에 덮어두고 까맣게 사라져버립니다. 앞으로는 그냥 '사용'하면 되는 것입니다. 더 많은 대다수의 학생은 이 과정이 무의미하다는 것을 알기 때문에 왜 이 해법이 이 문제를 문제없이 해결하는지의 증명은 간단히 건너뜁니다.

이런 학생들은 이미 주어진 알고리즘을 사용하는 일종의 객관식 혹은 문제 출제자가 존재하는 시험장 상황에서는 뛰어난 성적을 보일 것임은 자명합니다. 하지만 스스로가 문제와 해답을 모두 만들어내야 하는 상황이라면, 또는 해답이 존재하지 않을 가능성이 있는 상황이라면, 혹은 최적해를 구하는 것이 불가능에 가깝다면, 혹은 알고리즘을 완전히 새로 고안해내야 하거나 기존 알고리즘을 변형해야 하는 상황이라면 어떨까요?

교육은 물고기를 잡는 방법을 가르쳐야 합니다. 어떤 알고리즘을 배운다면 그 알고리즘을 고안해낸 사람이 어떤 사고 과정을 거쳐 그 해법에 도달했는지를 구경할 수 있어야 하고, 학생은 각자 스스로만의 해법을 차근차근 '구성'(construct)할 수 있어야 합니다(이를 교육철학에서 구성주의라고 합니다. 교육철학자 삐아제(Jean Piaget)의 제자이자, 마빈 민스키와 함께 MIT 미디어랩의 선구자인 세이머 페퍼트 박사가 주창했습니다). 전문가가 하는 것을 배우지 말고, 그들이 어떻게 전문가가 되었는지를 배우고 흉내 내야 합니다.

결국은 소크라테스적인 대화법입니다. 해답을 가르쳐 주지 않으면서도 초등학교 학생이 자신이 가진 지식만으로 스스로 퀵소트를 유도할 수 있도록 옆에서 도와줄 수 있습니까? 이것이 우리 스스로와 교사들에게 물어야 할 질문입니다.

왜 우리는 학교에서 '프로그래밍을 하는 과정'이나 '디자인 과정'(소프트웨어 공학에서 말하는 개발 프로세스가 아니라 몇 시간이나 몇 십 분 단위의, 개인적인 차원의 사고 과정 등을 일컫습니다)을 명시적으로 배운 적이 없을까요? 왜 해답에 이르는 과정을 가르쳐주는 사람이 없나요? 우리가 보는 것은 모조리 이미 훌륭히 완성된, 종적 상태의 결과물로서의 프로그램뿐입니다. 어느 날 문득 하늘에서 완성된 프로그램이 뚝 떨어지는 경우는 없는데 말입니다.

교수가 어떤 알고리즘 문제의 해답을 가르칠 때, "교수님, 교수님께서는 어떤 사고 과정을 거쳐, 그리고 어떤 디자인 과정과 프로그래밍 과정을 거쳐서 그 프로그램을 만드셨습니까?"하고 물어봅시다. 만약 여기에 어떤 체계적인 답변도 할 수 없는 분이라면 그 분은 자신의 사고에 대해 '사고'해 본 적이 없거나 문제 해결에 어떤 효율적 체계를 갖추지 못한 분이며, 따라서 아직 남을 가르칠 준비가 되어있지 않은 분일 것입니다. 만약 정말 그렇다면 우리는 어떻게 해야 할까요?

자료구조와 알고리즘 공부
제가 생각건대, 교육적인 목적에서는 자료구조나 알고리즘을 처음 공부할 때 우선은 특정 언어로 구현된 것을 보지 않는 것이 좋을 때가 많습니다. 대신 말로 된 설명이나 의사코드(pseudo-code) 등으로 그 개념까지만 이해하는 것이죠. 그 아이디어를 절차형(C, 어셈블리어)이나 함수형(LISP, Scheme, Haskell), 객체지향(자바, 스몰토크) 언어 등으로 직접 구현해 보는 겁니다. 그 다음에는 다른 사람이나 다른 책의 코드와 비교합니다. 이 경험을 애초에 박탈당한 사람은 귀중한 배움과 깨달음의 기회를 잃은 셈입니다.

만약 여러 사람이 함께 공부한다면 각자 동일한 아이디어를 같은 언어로 혹은 다른 언어로 어떻게 다르게 표현했는지를 서로 비교해 보면 배우는 것이 무척 많습니다.

우리가 자료구조나 알고리즘을 공부하는 이유는, 특정 '실세계의 문제'를 어떠한 '수학적 아이디어'로 매핑시켜 해결할 수 있는지, 그것이 효율적인지, 또 이를 컴퓨터에 어떻게 효율적으로 구현할 수 있는지 따지고, 그것을 실제로 구현하기 위해서입니다. 따라서 이 과정에 있어 실세계의 문제를 수학 문제로, 그리고 수학적 개념을 프로그래밍 언어로 효율적으로 표현해내는 것은 아주 중요한 능력이 됩니다.

알고리즘 공부에서 중요한 것
개별 알고리즘의 목록을 이해, 암기하며 익히는 것도 중요하지만 더 중요한 것은 다음 네 가지입니다.
①알고리즘을 스스로 생각해낼 수 있는 능력
②다른 알고리즘과 효율을 비교할 수 있는 능력
③알고리즘을 컴퓨터와 다른 사람이 이해할 수 있는 언어로 표현해낼 수 있는 능력
④이것의 정상작동(correctness) 여부를 검증해 내는 능력

첫 번째가 제대로 훈련되지 못한 사람은 알고리즘 목록의 스테레오 타입에만 길들여져 있어서 모든 문제를 자신이 아는 알고리즘 목록에 끼워 맞추려고 합니다. 디자인패턴을 잘못 공부한 사람과 비슷합니다. 이런 사람들은 마치 과거에 수학 정석만 수십 번 공부해 문제를 하나 던져주기만 하면, 생각해보지도 않고 자신이 풀었던 문제들의 패턴 중 가장 비슷한 것 하나를 기계적·무의식적으로 풀어제끼는 문제풀이기계와 비슷합니다. 그들에게 도중에 물어보십시오. "너 지금 무슨 문제 풀고 있는 거니?" 열심히 연습장에 뭔가 풀어나가고는 있지만 그들은 자신이 뭘 풀고 있는지도 제대로 인식하지 못 하는 경우가 많습니다.

머리가 푸는 게 아니고 손이 푸는 것이죠. 이렇게 되면 도구에 종속되는 '망치의 오류'에 빠지기 쉽습니다. 새로운 알고리즘을 고안해야 하는 상황에서도 기존 알고리즘에 계속 매달릴 뿐입니다. 알고리즘을 새로 고안해 내건 혹은 기존의 것을 조합하건 스스로 생각해 내는 훈련이 필요합니다.

두 번째가 제대로 훈련되지 못한 사람은 일일이 구현해 보고 실험해 봐야만 알고리즘 간의 효율을 비교할 수 있습니다. 특히 자신이 가진 카탈로그를 벗어난 알고리즘을 만나면 이 문제가 생깁니다. 이건 상당한 대가를 치르게 합니다.

세 번째가 제대로 훈련되지 못한 사람은, 문제를 보면 "아, 이건 이렇게 저렇게 해결하면 됩니다"하는 말은 곧잘 할 수 있지만 막상 컴퓨터 앞에 앉혀 놓으면 아무 것도 하지 못합니다. 심지어 자신이 생각해낸 그 구체적 알고리즘을 남에게 설명해 줄 수는 있지만, 그걸 '컴퓨터에게' 설명하는 데는 실패합니다. 뭔가 생각해낼 수 있다는 것과 그걸 컴퓨터가 이해할 수 있게 설명할 수 있다는 것은 다른 차원의 능력을 필요로 합니다.

네 번째가 제대로 훈련되지 못한 사람은, 알고리즘을 특정 언어로 구현해도, 그것이 옳다는 확신을 할 수 없습니다. 임시변통(ad hoc)의 아슬아슬한 코드가 되거나 이것저것 덧붙인 누더기 코드가 되기 쉽습니다. 이걸 피하려면 두 가지 훈련이 필요합니다.

하나는 수학적·논리학적 증명의 훈련이고, 다른 하나는 테스트 훈련입니다. 전자가 이론적이라면 후자는 실용적인 면이 강합니다. 양자는 상보적인 관계입니다. 특수한 경우들을 개별적으로 테스트해서는 검증해야 할 것이 너무 많고, 또 모든 경우에 대해 확신할 수 없습니다. 테스트가 버그의 부재를 보장할 수는 없습니다. 하지만 수학적 증명을 통하면 그것이 가능합니다. 또, 어떤 경우에는 수학적 증명을 굳이 할 필요 없이 단순히 테스트 케이스 몇 개만으로도 충분히 안정성이 보장되는 경우가 있습니다. 이럴 때는 그냥 테스트만으로 만족할 수 있습니다.

실질적이고 구체적인 문제를 함께 다루라
자료구조와 알고리즘 공부를 할 때에는 가능하면 실질적이고 구체적인 실세계의 문제를 함께 다루는 것이 큰 도움이 됩니다. 모든 학습에 있어 이는 똑같이 적용됩니다. 인류의 지성사를 봐도, 구상(concrete) 다음에 추상(abstract)이 옵니다. 인간 개체 하나의 성장을 봐도 그러합니다. 'be-동사 더하기 to-부정사'가 예정으로 해석될 수 있다는 룰만 외우는 것보다 다양한 예문을 실제 문맥 속에서 여러 번 보는 것이 훨씬 나을 것은 자명합니다. 알고리즘과 자료구조를 공부할 때 여러 친구들과 함께 연습문제(특히 우리가 경험하는 실세계의 대상들과 관련이 있는 것)를 풀어보기도 하고, ACM의 ICPC(International Collegiate Programming Contest: 세계 대학생 프로그래밍 경진 대회) 등의 프로그래밍 경진 대회 문제 중 해당 알고리즘·자료구조가 사용될 수 있는 문제를 같이 풀어보는 것도 아주 좋습니다. 이게 가능하려면 "이 알고리즘이 쓰이는 문제는 이거다"하고 가이드를 해줄 사람이 있으면 좋겠죠. 이것은 그 구체적 알고리즘·자료구조를 훈련하는 것이고, 이와 동시에 어떤 알고리즘을 써야할지 선택, 조합하는 것과 새로운 알고리즘을 만들어내는 훈련도 무척 중요합니다.

알고리즘 디자인 과정의 중요성
알고리즘을 좀더 수월하게, 또 잘 만들려면 알고리즘 디자인 과정에 대해 생각해 봐야 합니다. 그냥 밑도 끝도 없이 문제를 쳐다본다고 해서 알고리즘이 튀어나오진 않습니다. 체계적이고 효율적인 접근법을 사용해야 합니다. 대표적인 것으로 다익스트라(E. W. Dijkstra)와 워스(N. Wirth)의 '조금씩 개선하기'(Stepwise Refinement)가 있습니다. 워스의 「Program Development by Stepwise Refinement」(1971, CACM 14.4, http://www.acm.org/classics/dec95)를 꼭 읽어보길 바랍니다. 여기 소개된 조금씩 개선하기는 구조적 프로그래밍에서 핵심적 역할을 했습니다(구조적 프로그래밍을 'goto 문 제거' 정도로 생각하면 안 됩니다). 다익스트라의 「Stepwise Program Construction」(Selected Writings on Computing: A Personal Perspective, Springer-Verlag, 1982, http://www.cs.utexas.edu/users/EWD/ewd02xx/EWD227.PDF)도 추천합니다.

알고리즘 검증은 알고리즘 디자인과 함께 갑니다. 새로운 알고리즘을 고안할 때 검증해 가면서 디자인하기 때문입니다. 물론 가장 큰 역할은 고안이 끝났을 때의 검증입니다. 알고리즘 검증에는 루프 불변식(loop invariant) 같은 것이 아주 유용합니다. 아주 막강한 무기입니다. 익혀 두면 두고두고 가치를 발휘할 것입니다. 맨버(Udi Manber)의 알고리즘 서적(『Introduction to Algorithms: A Creative Approach』)이 알고리즘 검증과 디자인이 함께 진행해 가는 예로 자주 추천됩니다. 많은 계발을 얻을 것입니다. 고전으로는 다익스트라의 『A Discipline of Programming』과 그라이스(Gries)의 『The Science of Programming』이 있습니다. 특히 전자를 추천합니다. 프로그래밍에 대한 관을 뒤흔들어 놓을 것입니다.

알고리즘과 패러다임
알고리즘을 공부하면 큰 줄기들을 알아야 합니다. 개별 테크닉도 중요하지만 '패러다임'이라고 할만한 것들을 알아야 합니다. 이것에 대해서는 튜링상을 수상한 로버트 플로이드(Robert Floyd)의 튜링상 수상 강연(The Paradigms of Programming, 1978)을 추천합니다. 패러다임을 알아야 알고리즘을 상황에 맞게 마음대로 변통할 수 있습니다. 그리고 자신만의 분류법을 만들어야 합니다. 구체적인 문제들을 케이스 바이 케이스로 여럿 접하는 동안 그냥 지나쳐 버리면 개별자는 영원히 개별자로 남을 뿐입니다. 비슷한 문제들을 서로 묶어서 일반화해야 합니다.

이런 패러다임을 발견하려면 '다시 하기'가 아주 좋습니다. 다른 것들과 마찬가지로, 이 다시 하기는 알고리즘에서만이 아니고 모든 공부에 적용할 수 있습니다. 같은 것을 다시 해보는 것에서 우리는 얼마나 많은 것을 배울 수 있을까요. 대단히 많습니다. 왜 동일한 문제를 여러 번 풀고, 왜 같은 내용의 세미나에 또 다시 참석하고, 같은 프로그램을 거듭 작성할까요? 훨씬 더 많이 배울 수 있기 때문입니다. 화술 교육에서는 같은 주제에 대해 한 번 말해본 연사와 두 번 말해본 연사는 천지 차이가 있다고 말합니다. 같은 일에 대해 두 번의 기회가 주어지면 두 번째에는 첫 번째보다 잘 할 기회가 있습니다. 게다가 첫 번째 경험했던 것을 '터널을 벗어나서' 다소 객관적으로 볼 수 있게 됩니다. 왜 자신이 저번에 이걸 잘 못 했고, 저걸 잘 했는지 알게 되고, 어떻게 하면 그걸 더 잘할 수 있을는지 깨닫게 됩니다. 저는 똑같은 문제를 여러 번 풀더라도 매번 조금씩 다른 해답을 얻습니다. 그러면서 정말 엄청나게 많은 것을 배웁니다. '비슷한 문제'를 모두 풀 능력이 생깁니다.

제가 개인적으로 존경하는 전산학자 로버트 플로이드(Robert W. Floyd)는 1978년도 튜링상 강연에서 다음과 같은 말을 합니다.

제가 어려운 알고리즘을 디자인하는 경험을 생각해 볼 때, 제 능력을 넓히는 데 가장 도움이 되는 특정한 테크닉이 있습니다. 어려운 문제를 푼 후에, 저는 그것을 처음부터 완전히 새로 풉니다. 좀 전에 얻은 해법의 통찰(insight)만을 유지하면서 말이죠. 해법이 제가 희망하는 만큼 명료하고 직접적인 것이 될 때까지 반복합니다. 그런 다음, 비슷한 문제들을 공략할 어떤 일반적인 룰을 찾습니다. 아까 주어진 문제를 아예 처음부터 최고로 효율적인 방향에서 접근하도록 이끌어줬을 그런 룰을 찾는 거죠. 많은 경우에 그런 룰은 불변의 가치가 있습니다. … 포트란의 룰은 몇 시간 내에 배울 수 있습니다. 하지만 관련 패러다임은 훨씬 더 많은 시간이 걸립니다. 배우거나(learn) 배운 것을 잊거나(unlearn) 하는 데 모두.

수학자와 프로그래머를 포함한 모든 문제 해결자들의 고전이 된 죠지 폴리야(George Polya)의 『How to Solve it』에는 이런 말이 있습니다:

심지어는 꽤나 훌륭한 학생들도, 문제의 해법을 얻고 논증을 깨끗하게 적은 다음에는 책을 덮어버리고 뭔가 다른 것을 찾는다. 그렇게 하는 동안 그들은 그 노력의 중요하고도 교육적인 측면을 잃어버리게 된다. … 훌륭한 선생은 어떠한 문제이건 간에 완전히 바닥이 드러나는 경우는 없다는 관점을 스스로 이해하고 또 학생들에게 명심시켜야 한다.

저는 ACM의 ICPC 문제 중에 어떤 문제를 이제까지 열 번도 넘게 풀었습니다. 대부분 짝 프로그래밍이나 세미나를 통해 프로그래밍 시연을 했던 것인데, 제 세미나에 여러 번 참석한 분이 농담조로 웃으며 물었습니다. "신기해요. 창준 씨는 그 문제를 풀 때마다 다른 프로그램을 짜는 것 같아요. 설마 준비를 안 해 와서 그냥 내키는 대로 하는 건 아니죠?" 저는 카오스 시스템과 비슷하게 초기치 민감도가 프로그래밍에도 작용하는 것 같다고 대답했습니다. 저 스스로 다른 해법을 시도하고 싶은 마음이 있으면 출발이 조금 다르고, 또 거기서 나오는 진행 방향도 다르게 됩니다. 그런데 중요한 것은 이렇게 같은 문제를 매번 다르게 풀면서 배우는 것이 엄청나게 많다는 점입니다. 저는 매번, 전보다 개선할 것을 찾아내게 되고, 또 새로운 것을 배웁니다. 마치 마르지 않는 샘물처럼 계속 생각할 거리를 준다는 점이 참 놀랍습니다.

알고리즘 개론 교재로는 CLR(Introduction to Algorithms, Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest)을 추천합니다. 이와 함께 혹은 이 책을 읽기 전에 존 벤틀리(Jon Bentley)의 『Programming Pearls』도 강력 추천합니다. 세계적인 짱짱한 프로그래머와 전산학자들이 함께 꼽은 위대한 책 목록에서 몇 손가락 안에 드는 책입니다. 아직 이 책을 본 적이 없는 사람은 축하합니다. 아마 몇 주간은 감동 속에 하루하루를 보내게 될 겁니다.

리팩토링 학습에서의 문제
먼저, 본지 2001년 11월호에 제가 썼던 마틴 파울러의 책을 추천하는 글을 인용하겠습니다.

OOP를 하건 안 하건 프로그래밍이란 업을 하는 사람이라면 이 책은 자신의 공력을 서너 단계 레벨업시켜줄 수 있다. 자질구레한 술기를 익히는 것이 아니고 기감과 내공을 증강하는 것이다.

혹자는 DP 이전에 RF를 봐야 한다고도 한다. 이 말이 어느 정도 일리가 있는 것이, 효과적인 학습은 문제의식이 선행해야 하기 때문이다. DP는 거시적 차원에서 해결안을 모아놓은 것이다. RF를 보고 나쁜 냄새(Bad Smell)를 맡을 수 있는 후각을 발달시켜야 한다. RF의 목록을 모두 외우는 것은 큰 의미가 없다. 그것보다 냄새나는 코드를 느낄 수 있는 감수성을 키우는 것이 더 중요하다. 필자는 일주일에 한 가지씩 나쁜 냄새를 정해놓고 그 기간 동안에는 자신이 접하는 모든 코드에서 그 냄새만이라도 확실히 맡도록 집중하는 방법을 권한다. 일명 일취집중후각법. 패턴 개념을 만든 건축가 크리스토퍼 알렉산더나 GoF의 랄프 존슨은 좋은 디자인이란 나쁜 것이 없는 상태라고 한다. 무색 무미 무취의 무위(無爲)적 자연(自然) 코드가 되는 그 날을 위해 오늘도 우리는 리팩토링이라는 유위(有爲)를 익힌다.

주변에서 리팩토링을 잘못 공부하는 경우를 종종 접합니다. 어떤 의미에서 잘못 공부한다고 할까요? '실체화'가 문제입니다. 리팩토링은 도구이고 방편일 뿐인데, 그것에 매달리는 것은 달은 보지 않고 손을 보는 것과 같습니다. 저는 리팩토링 책이 또 하나의 (이미 그 병폐가 많이 드러난) GoF 책이 되는 현상이 매우 걱정됩니다.

리팩토링 공부
사람들이 일반적으로 생각하는 바와는 달리 리팩토링 학습에 있어 어떤 리팩토링이 있는지, 구체적으로 무엇인지 등의 리팩토링 목록에 대한 지식과 각각에 대한 메카닉스(Mechanics: 해당 리팩토링을 해나가는 구체적 단계)는 오히려 덜 중요할 수 있습니다. 더 기본적이고 유용한 것은 코드 냄새(Code Smell)와 짧은 테스트-코드 싸이클입니다. 이것만 제대로 되면 리팩토링 책의 모든 리팩토링을 스스로 구성해낼 수 있으며, 다른 종류의 리팩토링까지 직접 발견해낼 수 있습니다.

그 책에서는 테스트의 중요성이 충분히 강조되지 않았습니다. 하지만 테스트 코드 없는 리팩토링은 안전벨트 없는 자동차 경주와 같습니다. 그리고 테스트 코드가 리팩토링의 방향을 결정하기도 합니다. 양자는 음과 양처럼 서로 엮여 있습니다. 이런 의미에서 리팩토링은 TDD(Test Driven Development)와 함께 수련하는 것이 좋습니다. 훨씬 더 빨리, 훨씬 더 많은 것을 배울 수 있을 겁니다.

리팩토링을 공부할 때는 우선 코드 냄새의 종류를 알고, 왜 그것이 나쁜 냄새가 될 수 있는지 이해하고(그게 불가하다면 리팩토링 공부를 미뤄야 합니다) 거기에 동의할 수 있어야 합니다. 그런 다음, 대충 어떤 종류의 리팩토링이 가능한지 죽 훑어봅니다. 그 중 몇 개는 메카닉스를 따라가면서 실험해 봅니다. 이제는 책을 덮습니다. 그리고 실제 코드를 접하고, 만약 거기에서 냄새를 맡는다면 이 냄새를 없애기 위해 어떻게 해야 할지 스스로 고민합니다. 리팩토링 책의 목록은 일단 무시하십시오. 그 냄새를 없애는 것이 목표이지, 어떤 리팩토링을 여기에 '써먹는 것'이 목표가 되어선 안 됩니다. 이 때, 반드시 테스트 코드가 있어야 합니다. 그래야 '좋은' 리팩토링을 할 수 있습니다. 책을 떠나 있으면서도 책에서 떠나지 않는 방법입니다.

리팩토링을 하기 전에 초록색 불(테스트가 모두 통과)이 들어온 시점에서 코드를 일부 고치면 빨간 불(테스트가 하나라도 실패)로 바뀔 겁니다. 이게 다시 초록색 불이 될 때까지 최소한도의 시간이 걸리도록 하십시오. 현 초록색에서 다음 초록색까지 최소한의 시간을 소비하도록 하면서 코드와 테스팅을 오가게 되면 자기도 모르는 사이에 훌륭한 리팩토링이 자발공으로 터져 나옵니다. 여기서 목적지는 우선은 OAOO(Once And Only Once: 모든 것은 오로지 한 번만 말해야 한다)를 지키는 쪽으로 합니다. 그러면 OAOO와 짧은 테스트-코드 싸이클을 지키는 사이 어느새 탁월한 디자인이 튀어나옵니다. 참고로 저는 '모래시계 프로그래밍'이란 걸 가끔 합니다. 모래시계나 알람 프로그램으로 테스트-코드 사이클의 시간을 재는 것입니다. 그래서 가급적이면 한 사이클이 3분 이내(대부분의 모래시계는 단위가 3분입니다)에 끝나도록 노력합니다. 여기서 성공을 하건 실패를 하건 많은 걸 얻습니다.

리팩토링 수련법
제가 고안, 사용한 몇 가지 리팩토링 수련법을 알려드립니다.

①일취집중후각법: 앞에 소개한 본지 2001년 11월호에서 인용된 글 참조
②주석 최소화: 주석을 최소화하되 코드의 가독성이 떨어지지 않도록(혹은 오히려 올라가도록) 노력합니다. 이렇게 하면 자동으로 리팩토링이 이뤄지는 경우가 많습니다.
③OAOO 따르기: OAOO 법칙을 가능하면 최대한, 엄격하게 따르려고 합니다. 역시 자동으로 좋은 리팩토링이 이뤄집니다. 여기서 디자인패턴이 창발하기도 합니다. GoF 책을 한번도 보지 못한 사람이 디자인패턴을 자유자재로 부리는 경우를 보게 됩니다.
④디미터 법칙(Law of Demeter) 따르기: 디미터 법칙을 가능하면 지키려고 합니다. 어떤 리팩토링이 저절로 이뤄지거나 혹은 필요 없어지는가요?
⑤짝(Pair) 리팩토링: 함께 리팩토링합니다. 혼자 하는 것보다 더 빨리, 더 많은 걸 배우게 됩니다. 특히, 각자 작성했던 코드를 함께 리팩토링하고, 제3자의 코드를 함께 리팩토링해 봅니다. 사람이 많다면 다른 짝이 리팩토링한 것과 서로 비교하고 토론합니다.
⑥'무엇'과 '어떻게'를 분리: 어떻게에서 무엇을 분리해 내도록 합니다. 어떤 리팩토링이 창발합니까?

여기서 번, 짝 리팩토링은 아주 효과적인 방법입니다. 저는 이것을 협동적 학습(Collaborative Learning)이라고 부릅니다. 상대가 나보다 더 많이 아는 경우만이 아니고, 서로 아는 것이 비슷해도 많은 양의 학습이 발생합니다. 특히, 전문가와 함께 짝 프로그래밍을 하면 무서울 만큼 빠른 학습을 경험할 수 있습니다. 저와 짝 프로그래밍을 한 사람이 학습하는 속도에서 경이감을 느낀 적이 한두 번이 아닙니다. 문화는 사회적으로 학습되는 것입니다. 우리가 지식이라고 하는 것의 상당 부분은 문화처럼 암묵적인 지식(Tacit Knowledge)입니다. 전문가가 문제가 생겼을 때 어떻게 사고하고, 어떤 과정을 거쳐 접근하며, 어떻게 디버깅하고, 키보드를 어떤 식으로 누르는지, 사고 도구로 무엇을 사용하는지, 일 계획과 관리를 어떻게 하는지, 동료와 어떻게 대화하는지 등은 성문화되어 있지 않습니다. 그러나 이런 것들은 아주 중요합니다. 프로페셔널의 하루 일과의 대부분을 이루고 있기 때문입니다. 이런 것들은 전문가 바로 옆에서 조금씩 일을 도와주면서 배워야 합니다. 도제 살이(Apprenticeship)입니다. 진정한 전문가는 모든 동작이 우아합니다. 마치 춤을 추는 것 같습니다. 이 모습을 바로 곁에서 지켜보게 되면, 주니어는 한마디로 엄청난 충격을 받습니다. 그리고 스펀지처럼 빨아들이게 됩니다. 도대체 이 경험을 책이나 공장화한 학교가 어떻게 대신하겠습니까. 이와 관련해서는 레이브와 웽거(Jean Lave, Etienne Wenger)의 『Situated Learning : Legitimate Peripheral Participation』을 강력 추천합니다. 이 글을 보는 모든 교육 종사자들이 꼭 읽어봤으면 합니다. 이 협동적 학습은 두 사람만이 아니고 그룹 단위로도 가능합니다. 스터디에 이용하면 재미와 유익함 일석이조를 얻습니다.

이 외에, 특히(어쩌면 가장) 중요한 것은 일취집중후각법 등을 이용해 자신의 코드 후각의 민감도를 높이는 것입니다. 코드 후각의 메타포 말고도 유사한 메타포가 몇 가지 더 있습니다. 켄트 벡은 코드의 소리를 들으라고 하고, 저는 코드를 향해 대화하라고 합니다. 코드의 소리를 듣는 것은 코드가 원하는 것에 귀를 기울이는 것을 말합니다. 코드는 단순해지려는 욕망이 있습니다. 그걸 이뤄주는 것이 프로그래머입니다. 그리고 짝 프로그래밍을 할 때 두 사람의 대화를 코드에 남기도록 합니다. 주석이 아닙니다.

우리 프로그래머들은 항상 공부해야 합니다. 우리는 지식을 중요하게 여깁니다. 하지만 지식에 대한 지식, 즉 내가 그 지식을 얻은 과정이나 방법 같은 것은 소홀히 여기기 쉽습니다. 따라서 지식의 축적과 공유는 있어도 방법론의 축적과 공유는 매우 드문 편입니다. 저는 평소에 이런 생각에서 학교 후배들을 위해 제 자신의 공부 경험을 짬짬이 글로 옮겨놓았고, 이번 기회에 그 글들을 취합, 정리하게 되었습니다. 그 결실이 바로 이 글입니다

이 글은 공부하는 방법과 과정에 관한 글입니다. 이 글은 제가 공부한 성공/실패 경험을 기본 토대로 했고, 지난 몇 년간 주변에서 저보다 먼저 공부한 사람들의 경험을 관찰, 분석한 것에 제가 다시 직접 실험한 것과 그밖에 오랫동안 꾸준히 모아온 자료들을 더했습니다. '만약 다시 공부한다면' 저는 이렇게 공부할 것입니다.

부디 독자 제현께서 이 글을 씨앗으로 삼아 자신만의 나무를 키우고 거기서 열매를 얻고, 또 그 열매의 씨앗이 다시 누군가에게 전해질 수 있다면 더 이상 바랄 것이 없겠습니다.

이 글은 특정 주제들의 학습/교수법에 대한 문제점과 제가 경험한 좋은 공부법을 소개하는 식으로 구성됐습니다. 여기에 선택된 과목은 리팩토링, 알고리즘·자료구조, 디자인패턴, 익스트림 프로그래밍(Extreme Programming 혹은 XP) 네 가지입니다.

이 네 가지가 선택된 이유는 필자가 관심있게 공부했던 것이기 때문만은 아니고, 모든 프로그래머에게 어떻게든 널리 도움이 될만한 교양과목이라 생각하여 선택한 것입니다. 그런데 이 네 가지의 순서가 겉보기와는 달리 어떤 단계적 발전을 함의하는 것은 아닙니다. 수신(修身)이 끝나면 더 이상 수신은 하지 않고 제가(齊家)를 한다는 것이 어불성설인 것과 같습니다.

원래는 글 후미에 일반론으로서의 공부 패턴들을 쓰려고 했습니다. 하지만 지면의 제약도 있고, 독자 스스로 이 글에서 그런 패턴을 추출하는 것도 의미가 있을 것이기에 생략했습니다. 그런 일반론이 여기 저기 숨어있기 때문에 알고리즘 공부에 나온 방법 대부분이 리팩토링 공부에도 적용할 수 있고, 적용되어야 한다는 점을 꼭 기억해 주셨으면 합니다. 다음에 기회가 닿는다면 제가 평소 사용하는 (컴퓨터) 공부패턴들을 소개하겠습니다

복잡도(complexity)의 개념

알고리즘의 성능분석에 있어서의 복잡도(complexity)의 개념에 대해 살펴
보고 공간복잡도(space complexity)와 시간복잡도(time complexity)에 대해
알아본다.

4.1 알고리즘의 성능분석과 복잡도(complexity)
4.2 공간 복잡도(space complexity)
4.3 시간 복잡도(time complexity)

4.1 알고리즘의 성능분석과 복잡도(complexity)
앞 장에서도 언급했듯이 알고리즘은 유한한 횟수의 명령어들을 정해진
순서에 의하여 수행한 다음 언젠가는 반드시 종료되어야 한다.(유한성) 따
라서 알고리즘은 일단 시작된 다음 종료될 때까지의 실행시간이 이치에
맞지 않게 너무 길어서는 안된다. 장기 혹은 바둑과 같은 게임에 대한 알고
리즘을 예를 들어 생각해 볼 수 있다. 이와 같은 게임의 알고리즘을 개발할
때, 게임중 발생할 수 있는 모든 경우를 조사하여 알고리즘을 구성할 수 있
다. 그러나 이러한 방법으로는 알고리즘의 개발 조차도 불가능할 것이며,
그런 방법에 의한 알고리즘의 실행시 얼마만의 시간 안에 계산을 종료할
지도 추측할 수 없게 된다.
따라서, 일반적인 알고리즘은 상식적인 시간 안에 실행을 종료할 수 있어
야 하며, 가능한 한 빠른 시간내에 실행을 종료할 수 있어야 한다. 이러한
관점에서 알고리즘의 성능을 분석하기 위해 시간 복잡도(time complexity)
라는 개념을 사용하게 된다.
알고리즘의 성능 측정을 위한 수단에는 위에 소개된 시간 복잡도(time
complexity) 외에 공간 복잡도(space complexity)도 있다. 시간 복잡도가 알
고리즘의 실행시간을 의미한다면 공간 복잡도는 알고리즘을 수행시키기
위해 필요한 기억장치(memory)의 크기를 의미한다.

[정의4.1] 복잡도(complexity)
시간 복잡도(time complexity): 알고리즘을 실행하여 종료할때까지 필요한
시간
공간 복잡도(space complexity): 알고리즘을 실행하여 종료할때까지 필요
한 기억장치의 크기

4.2 공간 복잡도(space complexity)
주어진 알고리즘을 실행시키기 위해 필요한 기억장치(space)는 다음과
같이 두 가지로 분류해 볼 수 있다.
알고리즘과 무관한 부분: 알고리즘의 특성과는 무관한 부분으로 프로그램
코드를 저장하기 위한 공간, 프로그램을 수행하기 위해 시스템이 필요로
하는 공간 등이 이에 포함된다.
알고리즘과 밀접한 부분: 알고리즘의 특성과 밀접한 관계가 있는 부분으
로서 문제를 해결하기 위해 필요로 하는 공간을 의미한다. 즉, 변수를 저장
하기 위한 공간이나 순환 프로그램일 경우 순환 스택(recursion stack) 등이
이에 포함된다.
일반적으로 알고리즘의 공간 복잡도를 분석할때는 위의 두가지중 두 번
째의 것을 계산하게 된다. 즉, 알고리즘이 문제를 해결하기 위해서 반드시
필요한 부분만을 계산함으로써 주어진 알고리즘의 공간 복잡도를 계산한
다.
다음은 공간 복잡도를 구하는 예이다.
[예제4.1] 공간 복잡도의 계산 1

float abc(float a, float b, float c)
{
return(a + b + b*c + (a + b - c)/(a + b) + 4.0);
}

공간 복잡도 = 0

위의 프로그램에서 공간복잡도를 구하기 위해서 살펴볼 것은 변수 a, b,
c 이다. 따라서, float형의 변수가 한 워드(word)를 차지한다고 가정하면,
공간복잡도는 '3워드'라고 생각할 수 있다. 그러나 변수 a, b, c 는 전달되
는 인자(parameter)로서 함수 abc내에서 해결하고자 하는 문제와는 무관하
다고 볼 수 있으므로 공간 복잡도는 0이다.

[예제4.2] 공간 복잡도 계산 2
float Sum(float a[], int n)
{
float s = 0.0;
for(int i = 1; i < = n; i++)
s += a[i];
return s;
}

공간 복잡도 = n + 3

위의 프로그램에서 사용되는 변수는 a[], n, s, i 이다. 이번 예에서도 a[]
와 n은 인자로 전달 됨을 볼 수 있다. 그러나 [예제4.1]과는 다르게 변수
a[]는 합을 구하기 위하여 반복문 내에서 n개의 원소가 모두 참조되고 있
음을 볼 수 있다. 또한, n은 for-문을 벗어나기 위한 한계값으로 사용된다.
따라서 a[]와 n은 알고리즘이 해결하고자 하는 문제와 밀접한 관련이 있
다고 볼 수 있다. 그러므로 프로그램의 복잡도는 (a[]를 저장하기 위한 공
간) + (변수 n, s, I를 위한 공간) = n + 3 이 된다.

[예제4.3] 공간 복잡도 계산 3
float RSum(float a[], int n)
{
if(n <= 0)
return (0.0);
else
return (RSum(a, n-1) + a[n]);
}

공간 복잡도 = 3(n + 1)

위의 프로그램은 순환기법(resursion)으로 작성된 것이다. 위의 경우 살펴
볼 변수는 a[], n이다. 우선 변수 n은 if-문 내에서 순환의 한계값으로 사용
되고 있음을 볼 수 있다. 또한, 변수 a[]는 합을 구하기 위하여 사용되고
있으며 a[]의 원소 중에서 n번째의 원소만 필요로 한다. 따라서 변수 a[]
와 n이 모두 알고리즘과 밀접한 관계가 있으므로, 프로그램이 필요로 하는
공간은 (a[]의 n번째 원소를 의한 공간) + (n을 위한 공간) = 1 + 1 으로 볼
수 있다. 그러나 위의 프로그램은 순환기법에 의해 작성되었음을 고려해
야 한다. 즉, 프로그램이 순환적으로 실행될 것을 고려해서 몇번의 순환후
에 실행이 종료되는지(the depth of recursion)를 계산해야 하며, 또한 순환
을 위해서 필요한 복귀 주소(return address)를 저장할 공간도 계산해야 한
다. 그러므로 프로그램의 공간 복잡도는 (depth of recursion)×(a[n], n, 복
귀 주소를 위한 공간) = (n+1)×3 이 된다.

4.3 시간 복잡도(time complexity)
시간 복잡도는 알고리즘을 구성하는 명령어들이 몇 번이나 실행이 되는
지를 센 결과(frequency count)에 각 명령어의 실행시간(execution time)을
곱한 합계를 의미한다. 그러나 각 명령어의 실행시간은 특정 하드웨어 혹
은 프로그래밍 언어에 따라서 그 값이 달라질 수 있기 때문에 알고리즘의
일반적인 시간 복잡도는 명령어의 실제 실행시간을 제외한 명령어의 실행
횟수만을 고려하게 된다.
이와 같이 알고리즘을 이루는 명령어들의 실행횟수를 계산하여 알고리
즘의 시간 복잡도를 구하는 일을 알고리즘의 분석(analysis of algorithm)이
라고 한다. 또한, 알고리즘의 분석은 일반적으로 공간 복잡도 보다는 시간
복잡도를 통해서 이루어진다. 따라서 이번 강좌를 통해서 소개되는 알고
리즘들의 분석은 대부분 시간 복잡도를 이용할 것이며 특별한 언급이 없
는 한 '복잡도'는 '시간 복잡도'를 의미하게 된다.
다음은 시간 복잡도를 구하는 예이다.
[예제4.4] 시간 복잡도 계산 1 알고리즘
실행횟수

float Sum(float a[], int n)
{
float s = 0.0; 1
for(int i = 1; i <= n; I++) n+1
s += a[i]; n
return s; 1
}

명령어 총 실행횟수 = 2n + 3, 시간 복잡도 = O(n)
위 알고리즘의 총 실행 횟수는 2n+3 이 되므로 시간 복잡도는 2n+3이다.
또한 복잡도는 일반적으로 'big oh'표기법을 통해서 평균 실행시간을 표현
하게 되는데 위의 알고리즘의 경우는 'big oh' 표기법으로 O(n)이 된다. 복
잡도의 표기 방법에 대해서는 다음절에서 보다 자세히 살펴보도록 하겠다.
순환 알고리즘의 경우, 시간 복잡도를 구하기 위하여 명령어의 실행 횟수
를 세는 것은 위의 [예제4.4]와 같은 방법으로는 어렵다. 따라서 일반적으
로 순환 알고리즘의 시간 복잡도를 구하기 위해서는 순환식(recursive
formula)을 이용하게 된다. [예제4.3]에서 소개된 알고리즘을 예로 들면 다
음과 같다.

[예제4.5] 시간 복잡도 계산 2 알고리즘
실행횟수

float RSum(float a[], int n)
{
if(n <= 0) 1
return (0.0); 1
else
return (RSum(a, n-1) + a[n]); 1 + T(RSum(n-1))
}

명령어 총 실행횟수 = 2n + 1, 시간 복잡도 = O(n)
위의 알고리즘에서 사용된 순환식을 이용하면 명령어의 실행횟수는 다
음과 같음을 알 수 있다.
T(RSum(n)) = 2 ,if n = 0
= 2 + T(RSum(n-1) ,if n > 0
이와 같은 순환식은 또한 순환관계(recurrence relation)라고도 말하는데,
이러한 순환관계로부터 총 명령어 실행횟수를 구하는 방법은 다음과 같다.
T(RSum(n)) = 2 + T(RSum(n-1))
= 2 + ( 2 + T(RSum((n-2)) )
= 2 + ( 2 + ( 2 + T(RSum((n-3)) ) )
......
= 2×n + T(RSum(0))
= 2n + 2
따라서, 명령어 실행횟수의 총 합은 2n+2이고 시간 복잡도는 O(n)이다.

1. 알고리즘의 성능을 측정하기 위한 수단으로서 복잡도(complexity)를 사
용하며, 복잡도에는 알고리즘의 수행에 필요한 시간을 의미하는 시간 복
잡도(time complexity)와 필요한 기억장치(memory)의 크기를 의미하는 공
간 복잡도(space complexity)가 있다.

2. 공간 복잡도는 알고리즘이 해결하고자 하는 문제와 밀접한 관련이 있는
부분 즉, 변수를 저장하기 위한 공간, 순환 프로그램일 경우 순환
스택(recursion stack)등을 모두 합한 값으로 나타낸다.

3. 시간 복잡도는 알고리즘을 구성하는 명령어들의 실행횟수를 모두 합한
값으로 나타내며, 일반적인 알고리즘 분석은 시간 복잡도를 통하여 이루
어진다.

이 글에는 트랙백을 보낼 수 없습니다

Application_developing/C 2007/08/03 13:43

선택알고리즘

O(n) 선택 알고리즘에서 집단을 3개로 나누면?

5개로 나눌 때 O 성능 분석

Blum, Floyd, Pratt, Rivest, & Tarjan(1973)의 선택 알고리즘(정확히는 median-of-medians)은 O(n)의 성능으로 순위 통계량(i번째 원소)을 검색할 수 있게 해준다. 이 알고리즘은 n개의 원소를 1) 5개의 집단으로 나누고 각 집단의 중간값(median)을 찾아(Θ(n)) 2) 재귀적으로 _└n/5_┘ 집단의 중간값을 찾는 Select를 호출한다(T(n/5)). 3) 이 중간값을 피봇(pivot) x로 삼고, 원소를 분할(partitioning)한다(Θ(n)). 4) 최악의 경우 x보다 큰(작은) 원소가 ^┌5/2^┐·_└└n/5_┘/2_┘ = 3_└n/10_┘ ≥ n/4개 있으므로(n≥50) n - n/4 = 3n/4개에 대해 재귀호출을 하게 된다(T(3n/4)). 이상에서 T(n) = T(n/5) + T(3n/4) + Θ(n)이고, 강의노트를 참고해 점화식을 풀면 O(n)이 된다. 그런데 이때 집단을 5개가 아니라 3개로 나누면 어떻게 될까? 여전히 O(n)일까?

3개로 나눌 때 O 성능 분석

위의 1)에서 3개의 집단으로 나누고(Θ(n)) 2)에서 _└n/3_┚ 집단의 중간값을 찾으며(T(n/3)) 3)은 동일하다(Θ(n)). 4)에서 최악의 경우 x보다 큰(작은) 원소 ^┌3/2^┐·_└└n/3_┘/2_┘ = 2_└n/6_┘ 개에 대해 재귀호출을 한다. 이때의 성능이O(n)이라고 가정하고 추측확인법(guess & verify)을 적용해보자. n≥6일 때 2_└n/6_┘≥ n/6이므로, T(n) = T(n/3) + T(5n/6) + Θ(n) ≤ 1/3cn + 5/6cn + Θ(n) = 7/6cn + Θ(n) > cn이 되어 가정이 틀렸음을 알 수 있다. 즉, 3개의 집단으로 나누면 Select 알고리즘의 큰 특성인 선형 성능을 잃는다는 결론을 얻는다.

여기서 만약 O(n²)으로 가정하면 T(n) = 1/9cn² + 25/36cn² + Θ(n) = 29/36cn² + Θ(n) < cn²으로 가정을 만족한다. 더 근접한 상계를 찾기 위해 이번엔 O(n·lg n)이라 가정하면 T(n) = cn/3(lg cn – lg 3) + 5cn/6(lg cn + lg5 – lg 6) + Θ(n) = 7cn/6(lg cn) – cn/6(2lg3 - 5lg5 + 5lg6) + Θ(n) = 7/6cn(lg cn) + cn/6(lg9 – lg25 + lg7776) + Θ(n) > 7/6cn(lg cn) + Θ(n) > cn(lg cn)이므로 가정에 모순이다.

따라서 1<k≤2인 O(n^k)을 만족한다고 가정한다. 그리고 k를 1.1에서 0.1씩 높이며 대입해 나가면 1.4(소수점 자리를 늘리면 1.3725 정도)에서 최초로 가정이 만족되므로 O(n^1.4)을 찾게 된다.

이 글에는 트랙백을 보낼 수 없습니다

Application_developing/Java 2007/07/24 11:10

The Java Virtual Machine

The previous four chapters of this book gave a broad overview of Java's architecture. They showed how the Java virtual machine fits into the overall architecture relative to other components such as the language and API. The remainder of this book will focus more narrowly on the Java virtual machine. This chapter gives an overview of the Java virtual machine's internal architecture.

The Java virtual machine is called "virtual" because it is an abstract computer defined by a specification. To run a Java program, you need a concrete implementation of the abstract specification. This chapter describes primarily the abstract specification of the Java virtual machine. To illustrate the abstract definition of certain features, however, this chapter also discusses various ways in which those features could be implemented.

What is a Java Virtual Machine?

To understand the Java virtual machine you must first be aware that you may be talking about any of three different things when you say "Java virtual machine." You may be speaking of:

the abstract specification,
a concrete implementation, or
a runtime instance.

The abstract specification is a concept, described in detail in the book: The Java Virtual Machine Specification, by Tim Lindholm and Frank Yellin. Concrete implementations, which exist on many platforms and come from many vendors, are either all software or a combination of hardware and software. A runtime instance hosts a single running Java application.

Each Java application runs inside a runtime instance of some concrete implementation of the abstract specification of the Java virtual machine. In this book, the term "Java virtual machine" is used in all three of these senses. Where the intended sense is not clear from the context, one of the terms "specification," "implementation," or "instance" is added to the term "Java virtual machine".

The Lifetime of a Java Virtual Machine

A runtime instance of the Java virtual machine has a clear mission in life: to run one Java application. When a Java application starts, a runtime instance is born. When the application completes, the instance dies. If you start three Java applications at the same time, on the same computer, using the same concrete implementation, you'll get three Java virtual machine instances. Each Java application runs inside its own Java virtual machine.

A Java virtual machine instance starts running its solitary application by invoking the main() method of some initial class. The main() method must be public, static, return void, and accept one parameter: a String array. Any class with such a main() method can be used as the starting point for a Java application.

For example, consider an application that prints out its command line arguments:

// On CD-ROM in file jvm/ex1/Echo.java
class Echo {

    public static void main(String[] args) {
        int len = args.length;
        for (int i = 0; i < len; ++i) {
            System.out.print(args[i] + " ");
        }
        System.out.println();
    }
}

You must in some implementation-dependent way give a Java virtual machine the name of the initial class that has the main() method that will start the entire application. One real world example of a Java virtual machine implementation is the java program from Sun's Java 2 SDK. If you wanted to run the Echo application using Sun's java on Window98, for example, you would type in a command such as:

java Echo Greetings, Planet.

The first word in the command, "java," indicates that the Java virtual machine from Sun's Java 2 SDK should be run by the operating system. The second word, "Echo," is the name of the initial class. Echo must have a public static method named main() that returns void and takes a String array as its only parameter. The subsequent words, "Greetings, Planet.," are the command line arguments for the application. These are passed to the main() method in the String array in the order in which they appear on the command line. So, for the previous example, the contents of the String array passed to main in Echo are: arg[0] is "Greetings," arg[1] is "Planet."

The main() method of an application's initial class serves as the starting point for that application's initial thread. The initial thread can in turn fire off other threads.

Inside the Java virtual machine, threads come in two flavors: daemon and non- daemon. A daemon thread is ordinarily a thread used by the virtual machine itself, such as a thread that performs garbage collection. The application, however, can mark any threads it creates as daemon threads. The initial thread of an application--the one that begins at main()--is a non- daemon thread.

A Java application continues to execute (the virtual machine instance continues to live) as long as any non-daemon threads are still running. When all non-daemon threads of a Java application terminate, the virtual machine instance will exit. If permitted by the security manager, the application can also cause its own demise by invoking the exit() method of class Runtime or System.

In the Echo application previous, the main() method doesn't invoke any other threads. After it prints out the command line arguments, main() returns. This terminates the application's only non-daemon thread, which causes the virtual machine instance to exit.

The Architecture of the Java Virtual Machine

In the Java virtual machine specification, the behavior of a virtual machine instance is described in terms of subsystems, memory areas, data types, and instructions. These components describe an abstract inner architecture for the abstract Java virtual machine. The purpose of these components is not so much to dictate an inner architecture for implementations. It is more to provide a way to strictly define the external behavior of implementations. The specification defines the required behavior of any Java virtual machine implementation in terms of these abstract components and their interactions.

Figure 5-1 shows a block diagram of the Java virtual machine that includes the major subsystems and memory areas described in the specification. As mentioned in previous chapters, each Java virtual machine has a class loader subsystem: a mechanism for loading types (classes and interfaces) given fully qualified names. Each Java virtual machine also has an execution engine: a mechanism responsible for executing the instructions contained in the methods of loaded classes.

Figure 5-1. The internal architecture of the Java virtual machine.

When a Java virtual machine runs a program, it needs memory to store many things, including bytecodes and other information it extracts from loaded class files, objects the program instantiates, parameters to methods, return values, local variables, and intermediate results of computations. The Java virtual machine organizes the memory it needs to execute a program into several runtime data areas.

Although the same runtime data areas exist in some form in every Java virtual machine implementation, their specification is quite abstract. Many decisions about the structural details of the runtime data areas are left to the designers of individual implementations.

Different implementations of the virtual machine can have very different memory constraints. Some implementations may have a lot of memory in which to work, others may have very little. Some implementations may be able to take advantage of virtual memory, others may not. The abstract nature of the specification of the runtime data areas helps make it easier to implement the Java virtual machine on a wide variety of computers and devices.

Some runtime data areas are shared among all of an application's threads and others are unique to individual threads. Each instance of the Java virtual machine has one method area and one heap. These areas are shared by all threads running inside the virtual machine. When the virtual machine loads a class file, it parses information about a type from the binary data contained in the class file. It places this type information into the method area. As the program runs, the virtual machine places all objects the program instantiates onto the heap. See Figure 5-2 for a graphical depiction of these memory areas.

Figure 5-2. Runtime data areas shared among all threads.

As each new thread comes into existence, it gets its own pc register (program counter) and Java stack. If the thread is executing a Java method (not a native method), the value of the pc register indicates the next instruction to execute. A thread's Java stack stores the state of Java (not native) method invocations for the thread. The state of a Java method invocation includes its local variables, the parameters with which it was invoked, its return value (if any), and intermediate calculations. The state of native method invocations is stored in an implementation-dependent way in native method stacks, as well as possibly in registers or other implementation-dependent memory areas.

The Java stack is composed of stack frames (or frames). A stack frame contains the state of one Java method invocation. When a thread invokes a method, the Java virtual machine pushes a new frame onto that thread's Java stack. When the method completes, the virtual machine pops and discards the frame for that method.

The Java virtual machine has no registers to hold intermediate data values. The instruction set uses the Java stack for storage of intermediate data values. This approach was taken by Java's designers to keep the Java virtual machine's instruction set compact and to facilitate implementation on architectures with few or irregular general purpose registers. In addition, the stack-based architecture of the Java virtual machine's instruction set facilitates the code optimization work done by just-in-time and dynamic compilers that operate at run-time in some virtual machine implementations.

See Figure 5-3 for a graphical depiction of the memory areas the Java virtual machine creates for each thread. These areas are private to the owning thread. No thread can access the pc register or Java stack of another thread.

Figure 5-3. Runtime data areas exclusive to each thread.

Figure 5-3 shows a snapshot of a virtual machine instance in which three threads are executing. At the instant of the snapshot, threads one and two are executing Java methods. Thread three is executing a native method.

In Figure 5-3, as in all graphical depictions of the Java stack in this book, the stacks are shown growing downwards. The "top" of each stack is shown at the bottom of the figure. Stack frames for currently executing methods are shown in a lighter shade. For threads that are currently executing a Java method, the pc register indicates the next instruction to execute. In Figure 5-3, such pc registers (the ones for threads one and two) are shown in a lighter shade. Because thread three is currently executing a native method, the contents of its pc register--the one shown in dark gray--is undefined.

Data Types

The Java virtual machine computes by performing operations on certain types of data. Both the data types and operations are strictly defined by the Java virtual machine specification. The data types can be divided into a set of primitive types and a reference type. Variables of the primitive types hold primitive values, and variables of the reference type hold reference values. Reference values refer to objects, but are not objects themselves. Primitive values, by contrast, do not refer to anything. They are the actual data themselves. You can see a graphical depiction of the Java virtual machine's families of data types in Figure 5-4.

Figure 5-4. Data types of the Java virtual machine.

All the primitive types of the Java programming language are primitive types of the Java virtual machine. Although boolean qualifies as a primitive type of the Java virtual machine, the instruction set has very limited support for it. When a compiler translates Java source code into bytecodes, it uses ints or bytes to represent booleans. In the Java virtual machine, false is represented by integer zero and true by any non-zero integer. Operations involving boolean values use ints. Arrays of boolean are accessed as arrays of byte, though they may be represented on the heap as arrays of byte or as bit fields.

The primitive types of the Java programming language other than boolean form the numeric types of the Java virtual machine. The numeric types are divided between the integral types: byte, short, int, long, and char, and the floating- point types: float and double. As with the Java programming language, the primitive types of the Java virtual machine have the same range everywhere. A long in the Java virtual machine always acts like a 64-bit signed twos complement number, independent of the underlying host platform.

The Java virtual machine works with one other primitive type that is unavailable to the Java programmer: the returnAddress type. This primitive type is used to implement finally clauses of Java programs. The use of the returnAddress type is described in detail in Chapter 18, "Finally Clauses."

The reference type of the Java virtual machine is cleverly named reference. Values of type reference come in three flavors: the class type, the interface type, and the array type. All three types have values that are references to dynamically created objects. The class type's values are references to class instances. The array type's values are references to arrays, which are full-fledged objects in the Java virtual machine. The interface type's values are references to class instances that implement an interface. One other reference value is the null value, which indicates the reference variable doesn't refer to any object.

The Java virtual machine specification defines the range of values for each of the data types, but does not define their sizes. The number of bits used to store each data type value is a decision of the designers of individual implementations. The ranges of the Java virtual machines data type's are shown in Table 5-1. More information on the floating point ranges is given in Chapter 14, "Floating Point Arithmetic."

Type	Range
`byte`	8-bit signed two's complement integer (-2⁷ to 2⁷ - 1, inclusive)
`short`	16-bit signed two's complement integer (-2¹⁵ to 2¹⁵ - 1, inclusive)
`int`	32-bit signed two's complement integer (-2³¹ to 2³¹ - 1, inclusive)
`long`	64-bit signed two's complement integer (-2⁶³ to 2⁶³ - 1, inclusive)
`char`	16-bit unsigned Unicode character (0 to 2¹⁶ - 1, inclusive)
`float`	32-bit IEEE 754 single-precision float
`double`	64-bit IEEE 754 double-precision float
`returnAddress`	address of an opcode within the same method
`reference`	reference to an object on the heap, or `null`

Table 5-1. Ranges of the Java virtual machine's data types

Word Size

The basic unit of size for data values in the Java virtual machine is the word--a fixed size chosen by the designer of each Java virtual machine implementation. The word size must be large enough to hold a value of type byte, short, int, char, float, returnAddress, or reference. Two words must be large enough to hold a value of type long or double. An implementation designer must therefore choose a word size that is at least 32 bits, but otherwise can pick whatever word size will yield the most efficient implementation. The word size is often chosen to be the size of a native pointer on the host platform.

The specification of many of the Java virtual machine's runtime data areas are based upon this abstract concept of a word. For example, two sections of a Java stack frame--the local variables and operand stack-- are defined in terms of words. These areas can contain values of any of the virtual machine's data types. When placed into the local variables or operand stack, a value occupies either one or two words.

As they run, Java programs cannot determine the word size of their host virtual machine implementation. The word size does not affect the behavior of a program. It is only an internal attribute of a virtual machine implementation.

The Class Loader Subsystem

The part of a Java virtual machine implementation that takes care of finding and loading types is the class loader subsystem. Chapter 1, "Introduction to Java's Architecture," gives an overview of this subsystem. Chapter 3, "Security," shows how the subsystem fits into Java's security model. This chapter describes the class loader subsystem in more detail and show how it relates to the other components of the virtual machine's internal architecture.

As mentioned in Chapter 1, the Java virtual machine contains two kinds of class loaders: a bootstrap class loader and user-defined class loaders. The bootstrap class loader is a part of the virtual machine implementation, and user-defined class loaders are part of the running Java application. Classes loaded by different class loaders are placed into separate name spaces inside the Java virtual machine.

The class loader subsystem involves many other parts of the Java virtual machine and several classes from the java.lang library. For example, user-defined class loaders are regular Java objects whose class descends from java.lang.ClassLoader. The methods of class ClassLoader allow Java applications to access the virtual machine's class loading machinery. Also, for every type a Java virtual machine loads, it creates an instance of class java.lang.Class to represent that type. Like all objects, user-defined class loaders and instances of class Class reside on the heap. Data for loaded types resides in the method area.

Loading, Linking and Initialization

The class loader subsystem is responsible for more than just locating and importing the binary data for classes. It must also verify the correctness of imported classes, allocate and initialize memory for class variables, and assist in the resolution of symbolic references. These activities are performed in a strict order:

Loading: finding and importing the binary data for a type
Linking: performing verification, preparation, and (optionally) resolution
1. Verification: ensuring the correctness of the imported type
2. Preparation: allocating memory for class variables and initializing the memory to default values
3. Resolution: transforming symbolic references from the type into direct references.
Initialization: invoking Java code that initializes class variables to their proper starting values.

The details of these processes are given Chapter 7, "The Lifetime of a Type."

The Bootstrap Class Loader

Java virtual machine implementations must be able to recognize and load classes and interfaces stored in binary files that conform to the Java class file format. An implementation is free to recognize other binary forms besides class files, but it must recognize class files.

Every Java virtual machine implementation has a bootstrap class loader, which knows how to load trusted classes, including the classes of the Java API. The Java virtual machine specification doesn't define how the bootstrap loader should locate classes. That is another decision the specification leaves to implementation designers.

Given a fully qualified type name, the bootstrap class loader must in some way attempt to produce the data that defines the type. One common approach is demonstrated by the Java virtual machine implementation in Sun's 1.1 JDK on Windows98. This implementation searches a user-defined directory path stored in an environment variable named CLASSPATH. The bootstrap loader looks in each directory, in the order the directories appear in the CLASSPATH, until it finds a file with the appropriate name: the type's simple name plus ".class". Unless the type is part of the unnamed package, the bootstrap loader expects the file to be in a subdirectory of one the directories in the CLASSPATH. The path name of the subdirectory is built from the package name of the type. For example, if the bootstrap class loader is searching for class java.lang.Object, it will look for Object.class in the java\lang subdirectory of each CLASSPATH directory.

In 1.2, the bootstrap class loader of Sun's Java 2 SDK only looks in the directory in which the system classes (the class files of the Java API) were installed. The bootstrap class loader of the implementation of the Java virtual machine from Sun's Java 2 SDK does not look on the CLASSPATH. In Sun's Java 2 SDK virtual machine, searching the class path is the job of the system class loader, a user-defined class loader that is created automatically when the virtual machine starts up. More information on the class loading scheme of Sun's Java 2 SDK is given in Chapter 8, "The Linking Model."

User-Defined Class Loaders

Although user-defined class loaders themselves are part of the Java application, four of the methods in class ClassLoader are gateways into the Java virtual machine:

// Four of the methods declared in class java.lang.ClassLoader:
protected final Class defineClass(String name, byte data[],
    int offset, int length);
protected final Class defineClass(String name, byte data[],
    int offset, int length, ProtectionDomain protectionDomain);
protected final Class findSystemClass(String name);
protected final void resolveClass(Class c);

Any Java virtual machine implementation must take care to connect these methods of class ClassLoader to the internal class loader subsystem.

The two overloaded defineClass() methods accept a byte array, data[], as input. Starting at position offset in the array and continuing for length bytes, class ClassLoader expects binary data conforming to the Java class file format--binary data that represents a new type for the running application -- with the fully qualified name specified in name. The type is assigned to either a default protection domain, if the first version of defineClass() is used, or to the protection domain object referenced by the protectionDomain parameter. Every Java virtual machine implementation must make sure the defineClass() method of class ClassLoader can cause a new type to be imported into the method area.

The findSystemClass() method accepts a String representing a fully qualified name of a type. When a user-defined class loader invokes this method in version 1.0 and 1.1, it is requesting that the virtual machine attempt to load the named type via its bootstrap class loader. If the bootstrap class loader has already loaded or successfully loads the type, it returns a reference to the Class object representing the type. If it can't locate the binary data for the type, it throws ClassNotFoundException. In version 1.2, the findSystemClass() method attempts to load the requested type from the system class loader. Every Java virtual machine implementation must make sure the findSystemClass() method can invoke the bootstrap (if version 1.0 or 1.1) or system (if version 1.2 or later) class loader in this way.

The resolveClass() method accepts a reference to a Class instance. This method causes the type represented by the Class instance to be linked (if it hasn't already been linked). The defineClass() method, described previous, only takes care of loading. (See the previous section, "Loading, Linking, and Initialization" for definitions of these terms.) When defineClass() returns a Class instance, the binary file for the type has definitely been located and imported into the method area, but not necessarily linked and initialized. Java virtual machine implementations make sure the resolveClass() method of class ClassLoader can cause the class loader subsystem to perform linking.

The details of how a Java virtual machine performs class loading, linking, and initialization, with user- defined class loaders is given in Chapter 8, "The Linking Model."

Name Spaces

As mentioned in Chapter 3, "Security," each class loader maintains its own name space populated by the types it has loaded. Because each class loader has its own name space, a single Java application can load multiple types with the same fully qualified name. A type's fully qualified name, therefore, is not always enough to uniquely identify it inside a Java virtual machine instance. If multiple types of that same name have been loaded into different name spaces, the identity of the class loader that loaded the type (the identity of the name space it is in) will also be needed to uniquely identify that type.

Name spaces arise inside a Java virtual machine instance as a result of the process of resolution. As part of the data for each loaded type, the Java virtual machine keeps track of the class loader that imported the type. When the virtual machine needs to resolve a symbolic reference from one class to another, it requests the referenced class from the same class loader that loaded the referencing class. This process is described in detail in Chapter 8, "The Linking Model."

The Method Area

Inside a Java virtual machine instance, information about loaded types is stored in a logical area of memory called the method area. When the Java virtual machine loads a type, it uses a class loader to locate the appropriate class file. The class loader reads in the class file--a linear stream of binary data--and passes it to the virtual machine. The virtual machine extracts information about the type from the binary data and stores the information in the method area. Memory for class (static) variables declared in the class is also taken from the method area.

The manner in which a Java virtual machine implementation represents type information internally is a decision of the implementation designer. For example, multi-byte quantities in class files are stored in big- endian (most significant byte first) order. When the data is imported into the method area, however, a virtual machine can store the data in any manner. If an implementation sits on top of a little-endian processor, the designers may decide to store multi-byte values in the method area in little-endian order.

The virtual machine will search through and use the type information stored in the method area as it executes the application it is hosting. Designers must attempt to devise data structures that will facilitate speedy execution of the Java application, but must also think of compactness. If designing an implementation that will operate under low memory constraints, designers may decide to trade off some execution speed in favor of compactness. If designing an implementation that will run on a virtual memory system, on the other hand, designers may decide to store redundant information in the method area to facilitate execution speed. (If the underlying host doesn't offer virtual memory, but does offer a hard disk, designers could create their own virtual memory system as part of their implementation.) Designers can choose whatever data structures and organization they feel optimize their implementations performance, in the context of its requirements.

All threads share the same method area, so access to the method area's data structures must be designed to be thread-safe. If two threads are attempting to find a class named Lava, for example, and Lava has not yet been loaded, only one thread should be allowed to load it while the other one waits.

The size of the method area need not be fixed. As the Java application runs, the virtual machine can expand and contract the method area to fit the application's needs. Also, the memory of the method area need not be contiguous. It could be allocated on a heap--even on the virtual machine's own heap. Implementations may allow users or programmers to specify an initial size for the method area, as well as a maximum or minimum size.

The method area can also be garbage collected. Because Java programs can be dynamically extended via user-defined class loaders, classes can become "unreferenced" by the application. If a class becomes unreferenced, a Java virtual machine can unload the class (garbage collect it) to keep the memory occupied by the method area at a minimum. The unloading of classes--including the conditions under which a class can become "unreferenced"--is described in Chapter 7, "The Lifetime of a Type."

Type Information

For each type it loads, a Java virtual machine must store the following kinds of information in the method area:

The fully qualified name of the type
The fully qualified name of the type's direct superclass (unless the type is an interface or class java.lang.Object, neither of which have a superclass)
Whether or not the type is a class or an interface
The type's modifiers ( some subset of` public, abstract, final)
An ordered list of the fully qualified names of any direct superinterfaces

Inside the Java class file and Java virtual machine, type names are always stored as fully qualified names. In Java source code, a fully qualified name is the name of a type's package, plus a dot, plus the type's simple name. For example, the fully qualified name of class Object in package java.lang is java.lang.Object. In class files, the dots are replaced by slashes, as in java/lang/Object. In the method area, fully qualified names can be represented in whatever form and data structures a designer chooses.

In addition to the basic type information listed previously, the virtual machine must also store for each loaded type:

The constant pool for the type
Field information
Method information
All class (static) variables declared in the type, except constants
A reference to class ClassLoader
A reference to class Class

This data is described in the following sections.

The Constant Pool

For each type it loads, a Java virtual machine must store a constant pool. A constant pool is an ordered set of constants used by the type, including literals (string, integer, and floating point constants) and symbolic references to types, fields, and methods. Entries in the constant pool are referenced by index, much like the elements of an array. Because it holds symbolic references to all types, fields, and methods used by a type, the constant pool plays a central role in the dynamic linking of Java programs. The constant pool is described in more detail later in this chapter and in Chapter 6, "The Java Class File."

Field Information

For each field declared in the type, the following information must be stored in the method area. In addition to the information for each field, the order in which the fields are declared by the class or interface must also be recorded. Here's the list for fields:

The field's name
The field's type
The field's modifiers (some subset of public, private, protected, static, final, volatile, transient)

Method Information

For each method declared in the type, the following information must be stored in the method area. As with fields, the order in which the methods are declared by the class or interface must be recorded as well as the data. Here's the list:

The method's name
The method's return type (or void)
The number and types (in order) of the method's parameters
The method's modifiers (some subset of public, private, protected, static, final, synchronized, native, abstract)

In addition to the items listed previously, the following information must also be stored with each method that is not abstract or native:

The method's bytecodes
The sizes of the operand stack and local variables sections of the method's stack frame (these are described in a later section of this chapter)
An exception table (this is described in Chapter 17, "Exceptions")

Class Variables

Class variables are shared among all instances of a class and can be accessed even in the absence of any instance. These variables are associated with the class--not with instances of the class--so they are logically part of the class data in the method area. Before a Java virtual machine uses a class, it must allocate memory from the method area for each non-final class variable declared in the class.

Constants (class variables declared final) are not treated in the same way as non-final class variables. Every type that uses a final class variable gets a copy of the constant value in its own constant pool. As part of the constant pool, final class variables are stored in the method area--just like non-final class variables. But whereas non-final class variables are stored as part of the data for the type that declares them, final class variables are stored as part of the data for any type that uses them. This special treatment of constants is explained in more detail in Chapter 6, "The Java Class File."

A Reference to Class `ClassLoader`

For each type it loads, a Java virtual machine must keep track of whether or not the type was loaded via the bootstrap class loader or a user-defined class loader. For those types loaded via a user-defined class loader, the virtual machine must store a reference to the user-defined class loader that loaded the type. This information is stored as part of the type's data in the method area.

The virtual machine uses this information during dynamic linking. When one type refers to another type, the virtual machine requests the referenced type from the same class loader that loaded the referencing type. This process of dynamic linking is also central to the way the virtual machine forms separate name spaces. To be able to properly perform dynamic linking and maintain multiple name spaces, the virtual machine needs to know what class loader loaded each type in its method area. The details of dynamic linking and name spaces are given in Chapter 8, "The Linking Model."

A Reference to Class `Class`

An instance of class java.lang.Class is created by the Java virtual machine for every type it loads. The virtual machine must in some way associate a reference to the Class instance for a type with the type's data in the method area.

Your Java programs can obtain and use references to Class objects. One static method in class Class, allows you to get a reference to the Class instance for any loaded class:

// A method declared in class java.lang.Class:
public static Class forName(String className);

If you invoke forName("java.lang.Object"), for example, you will get a reference to the Class object that represents java.lang.Object. If you invoke forName("java.util.Enumeration"), you will get a reference to the Class object that represents the Enumeration interface from the java.util package. You can use forName() to get a Class reference for any loaded type from any package, so long as the type can be (or already has been) loaded into the current name space. If the virtual machine is unable to load the requested type into the current name space, forName() will throw ClassNotFoundException.

An alternative way to get a Class reference is to invoke getClass() on any object reference. This method is inherited by every object from class Object itself:

// A method declared in class java.lang.Object:
public final Class getClass();

If you have a reference to an object of class java.lang.Integer, for example, you could get the Class object for java.lang.Integer simply by invoking getClass() on your reference to the Integer object.

Given a reference to a Class object, you can find out information about the type by invoking methods declared in class Class. If you look at these methods, you will quickly realize that class Class gives the running application access to the information stored in the method area. Here are some of the methods declared in class Class:

// Some of the methods declared in class java.lang.Class:
public String getName();
public Class getSuperClass();
public boolean isInterface();
public Class[] getInterfaces();
public ClassLoader getClassLoader();

These methods just return information about a loaded type. getName() returns the fully qualified name of the type. getSuperClass() returns the Class instance for the type's direct superclass. If the type is class java.lang.Object or an interface, none of which have a superclass, getSuperClass() returns null. isInterface() returns true if the Class object describes an interface, false if it describes a class. getInterfaces() returns an array of Class objects, one for each direct superinterface. The superinterfaces appear in the array in the order they are declared as superinterfaces by the type. If the type has no direct superinterfaces, getInterfaces() returns an array of length zero. getClassLoader() returns a reference to the ClassLoader object that loaded this type, or null if the type was loaded by the bootstrap class loader. All this information comes straight out of the method area.

Method Tables

The type information stored in the method area must be organized to be quickly accessible. In addition to the raw type information listed previously, implementations may include other data structures that speed up access to the raw data. One example of such a data structure is a method table. For each non-abstract class a Java virtual machine loads, it could generate a method table and include it as part of the class information it stores in the method area. A method table is an array of direct references to all the instance methods that may be invoked on a class instance, including instance methods inherited from superclasses. (A method table isn't helpful in the case of abstract classes or interfaces, because the program will never instantiate these.) A method table allows a virtual machine to quickly locate an instance method invoked on an object. Method tables are described in detail in Chapter 8, "The Linking Model."

An Example of Method Area Use

As an example of how the Java virtual machine uses the information it stores in the method area, consider these classes:

// On CD-ROM in file jvm/ex2/Lava.java
class Lava {

    private int speed = 5; // 5 kilometers per hour

    void flow() {
    }
}

// On CD-ROM in file jvm/ex2/Volcano.java
class Volcano {

    public static void main(String[] args) {
        Lava lava = new Lava();
        lava.flow();
    }
}

The following paragraphs describe how an implementation might execute the first instruction in the bytecodes for the main() method of the Volcano application. Different implementations of the Java virtual machine can operate in very different ways. The following description illustrates one way--but not the only way--a Java virtual machine could execute the first instruction of Volcano's main() method.

To run the Volcano application, you give the name "Volcano" to a Java virtual machine in an implementation-dependent manner. Given the name Volcano, the virtual machine finds and reads in file Volcano.class. It extracts the definition of class Volcano from the binary data in the imported class file and places the information into the method area. The virtual machine then invokes the main() method, by interpreting the bytecodes stored in the method area. As the virtual machine executes main(), it maintains a pointer to the constant pool (a data structure in the method area) for the current class (class Volcano).

Note that this Java virtual machine has already begun to execute the bytecodes for main() in class Volcano even though it hasn't yet loaded class Lava. Like many (probably most) implementations of the Java virtual machine, this implementation doesn't wait until all classes used by the application are loaded before it begins executing main(). It loads classes only as it needs them.

main()'s first instruction tells the Java virtual machine to allocate enough memory for the class listed in constant pool entry one. The virtual machine uses its pointer into Volcano's constant pool to look up entry one and finds a symbolic reference to class Lava. It checks the method area to see if Lava has already been loaded.

The symbolic reference is just a string giving the class's fully qualified name: "Lava". Here you can see that the method area must be organized so a class can be located--as quickly as possible--given only the class's fully qualified name. Implementation designers can choose whatever algorithm and data structures best fit their needs--a hash table, a search tree, anything. This same mechanism can be used by the static forName() method of class Class, which returns a Class reference given a fully qualified name.

When the virtual machine discovers that it hasn't yet loaded a class named "Lava," it proceeds to find and read in file Lava.class. It extracts the definition of class Lava from the imported binary data and places the information into the method area.

The Java virtual machine then replaces the symbolic reference in Volcano's constant pool entry one, which is just the string "Lava", with a pointer to the class data for Lava. If the virtual machine ever has to use Volcano's constant pool entry one again, it won't have to go through the relatively slow process of searching through the method area for class Lava given only a symbolic reference, the string "Lava". It can just use the pointer to more quickly access the class data for Lava. This process of replacing symbolic references with direct references (in this case, a native pointer) is called constant pool resolution. The symbolic reference is resolved into a direct reference by searching through the method area until the referenced entity is found, loading new classes if necessary.

Finally, the virtual machine is ready to actually allocate memory for a new Lava object. Once again, the virtual machine consults the information stored in the method area. It uses the pointer (which was just put into Volcano's constant pool entry one) to the Lava data (which was just imported into the method area) to find out how much heap space is required by a Lava object.

A Java virtual machine can always determine the amount of memory required to represent an object by looking into the class data stored in the method area. The actual amount of heap space required by a particular object, however, is implementation-dependent. The internal representation of objects inside a Java virtual machine is another decision of implementation designers. Object representation is discussed in more detail later in this chapter.

Once the Java virtual machine has determined the amount of heap space required by a Lava object, it allocates that space on the heap and initializes the instance variable speed to zero, its default initial value. If class Lava's superclass, Object, has any instance variables, those are also initialized to default initial values. (The details of initialization of both classes and objects are given in Chapter 7, "The Lifetime of a Type.")

The first instruction of main() completes by pushing a reference to the new Lava object onto the stack. A later instruction will use the reference to invoke Java code that initializes the speed variable to its proper initial value, five. Another instruction will use the reference to invoke the flow() method on the referenced Lava object.

The Heap

Whenever a class instance or array is created in a running Java application, the memory for the new object is allocated from a single heap. As there is only one heap inside a Java virtual machine instance, all threads share it. Because a Java application runs inside its "own" exclusive Java virtual machine instance, there is a separate heap for every individual running application. There is no way two different Java applications could trample on each other's heap data. Two different threads of the same application, however, could trample on each other's heap data. This is why you must be concerned about proper synchronization of multi-threaded access to objects (heap data) in your Java programs.

The Java virtual machine has an instruction that allocates memory on the heap for a new object, but has no instruction for freeing that memory. Just as you can't explicitly free an object in Java source code, you can't explicitly free an object in Java bytecodes. The virtual machine itself is responsible for deciding whether and when to free memory occupied by objects that are no longer referenced by the running application. Usually, a Java virtual machine implementation uses a garbage collector to manage the heap.

Garbage Collection

A garbage collector's primary function is to automatically reclaim the memory used by objects that are no longer referenced by the running application. It may also move objects as the application runs to reduce heap fragmentation.

A garbage collector is not strictly required by the Java virtual machine specification. The specification only requires that an implementation manage its own heap in some manner. For example, an implementation could simply have a fixed amount of heap space available and throw an OutOfMemory exception when that space fills up. While this implementation may not win many prizes, it does qualify as a Java virtual machine. The Java virtual machine specification does not say how much memory an implementation must make available to running programs. It does not say how an implementation must manage its heap. It says to implementation designers only that the program will be allocating memory from the heap, but not freeing it. It is up to designers to figure out how they want to deal with that fact.

No garbage collection technique is dictated by the Java virtual machine specification. Designers can use whatever techniques seem most appropriate given their goals, constraints, and talents. Because references to objects can exist in many places--Java Stacks, the heap, the method area, native method stacks--the choice of garbage collection technique heavily influences the design of an implementation's runtime data areas. Various garbage collection techniques are described in Chapter 9, "Garbage Collection."

As with the method area, the memory that makes up the heap need not be contiguous, and may be expanded and contracted as the running program progresses. An implementation's method area could, in fact, be implemented on top of its heap. In other words, when a virtual machine needs memory for a freshly loaded class, it could take that memory from the same heap on which objects reside. The same garbage collector that frees memory occupied by unreferenced objects could take care of finding and freeing (unloading) unreferenced classes. Implementations may allow users or programmers to specify an initial size for the heap, as well as a maximum and minimum size.

Object Representation

The Java virtual machine specification is silent on how objects should be represented on the heap. Object representation--an integral aspect of the overall design of the heap and garbage collector--is a decision of implementation designers

The primary data that must in some way be represented for each object is the instance variables declared in the object's class and all its superclasses. Given an object reference, the virtual machine must be able to quickly locate the instance data for the object. In addition, there must be some way to access an object's class data (stored in the method area) given a reference to the object. For this reason, the memory allocated for an object usually includes some kind of pointer into the method area.

One possible heap design divides the heap into two parts: a handle pool and an object pool. An object reference is a native pointer to a handle pool entry. A handle pool entry has two components: a pointer to instance data in the object pool and a pointer to class data in the method area. The advantage of this scheme is that it makes it easy for the virtual machine to combat heap fragmentation. When the virtual machine moves an object in the object pool, it need only update one pointer with the object's new address: the relevant pointer in the handle pool. The disadvantage of this approach is that every access to an object's instance data requires dereferencing two pointers. This approach to object representation is shown graphically in Figure 5-5. This kind of heap is demonstrated interactively by the HeapOfFish applet, described in Chapter 9, "Garbage Collection."

Figure 5-5. Splitting an object across a handle pool and object pool.

Another design makes an object reference a native pointer to a bundle of data that contains the object's instance data and a pointer to the object's class data. This approach requires dereferencing only one pointer to access an object's instance data, but makes moving objects more complicated. When the virtual machine moves an object to combat fragmentation of this kind of heap, it must update every reference to that object anywhere in the runtime data areas. This approach to object representation is shown graphically in Figure 5-6.

Figure 5-6. Keeping object data all in one place.

The virtual machine needs to get from an object reference to that object's class data for several reasons. When a running program attempts to cast an object reference to another type, the virtual machine must check to see if the type being cast to is the actual class of the referenced object or one of its supertypes. . It must perform the same kind of check when a program performs an instanceof operation. In either case, the virtual machine must look into the class data of the referenced object. When a program invokes an instance method, the virtual machine must perform dynamic binding: it must choose the method to invoke based not on the type of the reference but on the class of the object. To do this, it must once again have access to the class data given only a reference to the object.

No matter what object representation an implementation uses, it is likely that a method table is close at hand for each object. Method tables, because they speed up the invocation of instance methods, can play an important role in achieving good overall performance for a virtual machine implementation. Method tables are not required by the Java virtual machine specification and may not exist in all implementations. Implementations that have extremely low memory requirements, for instance, may not be able to afford the extra memory space method tables occupy. If an implementation does use method tables, however, an object's method table will likely be quickly accessible given just a reference to the object.

One way an implementation could connect a method table to an object reference is shown graphically in Figure 5-7. This figure shows that the pointer kept with the instance data for each object points to a special structure. The special structure has two components:

A pointer to the full the class data for the object
The method table for the object The method table is an array of pointers to the data for each instance method that can be invoked on objects of that class. The method data pointed to by method table includes:
The sizes of the operand stack and local variables sections of the method's stack
The method's bytecodes
An exception table

This gives the virtual machine enough information to invoke the method. The method table include pointers to data for methods declared explicitly in the object's class or inherited from superclasses. In other words, the pointers in the method table may point to methods defined in the object's class or any of its superclasses. More information on method tables is given in Chapter 8, "The Linking Model."

Figure 5-7. Keeping the method table close at hand.

If you are familiar with the inner workings of C++, you may recognize the method table as similar to the VTBL or virtual table of C++ objects. In C++, objects are represented by their instance data plus an array of pointers to any virtual functions that can be invoked on the object. This approach could also be taken by a Java virtual machine implementation. An implementation could include a copy of the method table for a class as part of the heap image for every instance of that class. This approach would consume more heap space than the approach shown in Figure 5-7, but might yield slightly better performance on a systems that enjoy large quantities of available memory.

One other kind of data that is not shown in Figures 5-5 and 5-6, but which is logically part of an object's data on the heap, is the object's lock. Each object in a Java virtual machine is associated with a lock (or mutex) that a program can use to coordinate multi-threaded access to the object. Only one thread at a time can "own" an object's lock. While a particular thread owns a particular object's lock, only that thread can access that object's instance variables. All other threads that attempt to access the object's variables have to wait until the owning thread releases the object's lock. If a thread requests a lock that is already owned by another thread, the requesting thread has to wait until the owning thread releases the lock. Once a thread owns a lock, it can request the same lock again multiple times, but then has to release the lock the same number of times before it is made available to other threads. If a thread requests a lock three times, for example, that thread will continue to own the lock until it has released it three times.

Many objects will go through their entire lifetimes without ever being locked by a thread. The data required to implement an object's lock is not needed unless the lock is actually requested by a thread. As a result, many implementations, such as the ones shown in Figure 5-5 and 5-6, may not include a pointer to "lock data" within the object itself. Such implementations must create the necessary data to represent a lock when the lock is requested for the first time. In this scheme, the virtual machine must associate the lock with the object in some indirect way, such as by placing the lock data into a search tree based on the object's address.

Along with data that implements a lock, every Java object is logically associated with data that implements a wait set. Whereas locks help threads to work independently on shared data without interfering with one another, wait sets help threads to cooperate with one another--to work together towards a common goal.

Wait sets are used in conjunction with wait and notify methods. Every class inherits from Object three "wait methods" (overloaded forms of a method named wait()) and two "notify methods" (notify() and notifyAll()). When a thread invokes a wait method on an object, the Java virtual machine suspends that thread and adds it to that object's wait set. When a thread invokes a notify method on an object, the virtual machine will at some future time wake up one or more threads from that object's wait set. As with the data that implements an object's lock, the data that implements an object's wait set is not needed unless a wait or notify method is actually invoked on the object. As a result, many implementations of the Java virtual machine may keep the wait set data separate from the actual object data. Such implementations could allocate the data needed to represent an object's wait set when a wait or notify method is first invoked on that object by the running application. For more information about locks and wait sets, see Chapter 20, "Thread Synchronization."

One last example of a type of data that may be included as part of the image of an object on the heap is any data needed by the garbage collector. The garbage collector must in some way keep track of which objects are referenced by the program. This task invariably requires data to be kept for each object on the heap. The kind of data required depends upon the garbage collection technique being used. For example, if an implementation uses a mark and sweep algorithm, it must be able to mark an object as referenced or unreferenced. For each unreferenced object, it may also need to indicate whether or not the object's finalizer has been run. As with thread locks, this data may be kept separate from the object image. Some garbage collection techniques only require this extra data while the garbage collector is actually running. A mark and sweep algorithm, for instance, could potentially use a separate bitmap for marking referenced and unreferenced objects. More detail on various garbage collection techniques, and the data that is required by each of them, is given in Chapter 9, "Garbage Collection."

In addition to data that a garbage collector uses to distinguish between reference and unreferenced objects, a garbage collector needs data to keep track of which objects on which it has already executed a finalizer. Garbage collectors must run the finalizer of any object whose class declares one before it reclaims the memory occupied by that object. The Java language specification states that a garbage collector will only execute an object's finalizer once, but allows that finalizer to "resurrect" the object: to make the object referenced again. When the object becomes unreferenced for a second time, the garbage collector must not finalize it again. Because most objects will likely not have a finalizer, and very few of those will resurrect their objects, this scenario of garbage collecting the same object twice will probably be extremely rare. As a result, the data used to keep track of objects that have already been finalized, though logically part of the data associated with an object, will likely not be part of the object representation on the heap. In most cases, garbage collectors will keep this information in a separate place. Chapter 9, "Garbage Collection," gives more information about finalization.

Array Representation

In Java, arrays are full-fledged objects. Like objects, arrays are always stored on the heap. Also like objects, implementation designers can decide how they want to represent arrays on the heap.

Arrays have a Class instance associated with their class, just like any other object. All arrays of the same dimension and type have the same class. The length of an array (or the lengths of each dimension of a multidimensional array) does not play any role in establishing the array's class. For example, an array of three ints has the same class as an array of three hundred ints. The length of an array is considered part of its instance data.

The name of an array's class has one open square bracket for each dimension plus a letter or string representing the array's type. For example, the class name for an array of ints is "[I". The class name for a three-dimensional array of bytes is "[[[B". The class name for a two-dimensional array of Objects is "[[Ljava.lang.Object". The full details of this naming convention for array classes is given in Chapter 6, "The Java Class File."

Multi-dimensional arrays are represented as arrays of arrays. A two dimensional array of ints, for example, would be represented by a one dimensional array of references to several one dimensional arrays of ints. This is shown graphically in Figure 5-8.

Figure 5-8. One possible heap representation for arrays.

The data that must be kept on the heap for each array is the array's length, the array data, and some kind of reference to the array's class data. Given a reference to an array, the virtual machine must be able to determine the array's length, to get and set its elements by index (checking to make sure the array bounds are not exceeded), and to invoke any methods declared by Object, the direct superclass of all arrays.

The Program Counter

Each thread of a running program has its own pc register, or program counter, which is created when the thread is started. The pc register is one word in size, so it can hold both a native pointer and a returnAddress. As a thread executes a Java method, the pc register contains the address of the current instruction being executed by the thread. An "address" can be a native pointer or an offset from the beginning of a method's bytecodes. If a thread is executing a native method, the value of the pc register is undefined.

The Java Stack

When a new thread is launched, the Java virtual machine creates a new Java stack for the thread. As mentioned earlier, a Java stack stores a thread's state in discrete frames. The Java virtual machine only performs two operations directly on Java Stacks: it pushes and pops frames.

The method that is currently being executed by a thread is the thread's current method. The stack frame for the current method is the current frame. The class in which the current method is defined is called the current class, and the current class's constant pool is the current constant pool. As it executes a method, the Java virtual machine keeps track of the current class and current constant pool. When the virtual machine encounters instructions that operate on data stored in the stack frame, it performs those operations on the current frame.

When a thread invokes a Java method, the virtual machine creates and pushes a new frame onto the thread's Java stack. This new frame then becomes the current frame. As the method executes, it uses the frame to store parameters, local variables, intermediate computations, and other data.

A method can complete in either of two ways. If a method completes by returning, it is said to have normal completion. If it completes by throwing an exception, it is said to have abrupt completion. When a method completes, whether normally or abruptly, the Java virtual machine pops and discards the method's stack frame. The frame for the previous method then becomes the current frame.

All the data on a thread's Java stack is private to that thread. There is no way for a thread to access or alter the Java stack of another thread. Because of this, you need never worry about synchronizing multi- threaded access to local variables in your Java programs. When a thread invokes a method, the method's local variables are stored in a frame on the invoking thread's Java stack. Only one thread can ever access those local variables: the thread that invoked the method.

Like the method area and heap, the Java stack and stack frames need not be contiguous in memory. Frames could be allocated on a contiguous stack, or they could be allocated on a heap, or some combination of both. The actual data structures used to represent the Java stack and stack frames is a decision of implementation designers. Implementations may allow users or programmers to specify an initial size for Java stacks, as well as a maximum or minimum size.

The Stack Frame

The stack frame has three parts: local variables, operand stack, and frame data. The sizes of the local variables and operand stack, which are measured in words, depend upon the needs of each individual method. These sizes are determined at compile time and included in the class file data for each method. The size of the frame data is implementation dependent.

When the Java virtual machine invokes a Java method, it checks the class data to determine the number of words required by the method in the local variables and operand stack. It creates a stack frame of the proper size for the method and pushes it onto the Java stack.

Local Variables

The local variables section of the Java stack frame is organized as a zero-based array of words. Instructions that use a value from the local variables section provide an index into the zero-based array. Values of type int, float, reference, and returnAddress occupy one entry in the local variables array. Values of type byte, short, and char are converted to int before being stored into the local variables. Values of type long and double occupy two consecutive entries in the array.

To refer to a long or double in the local variables, instructions provide the index of the first of the two consecutive entries occupied by the value. For example, if a long occupies array entries three and four, instructions would refer to that long by index three. All values in the local variables are word-aligned. Dual-entry longs and doubles can start at any index.

The local variables section contains a method's parameters and local variables. Compilers place the parameters into the local variable array first, in the order in which they are declared. Figure 5-9 shows the local variables section for the following two methods:

// On CD-ROM in file jvm/ex3/Example3a.java
class Example3a {

    public static int runClassMethod(int i, long l, float f,
        double d, Object o, byte b) {

        return 0;
    }

    public int runInstanceMethod(char c, double d, short s,
        boolean b) {

        return 0;
    }
}

Figure 5-9. Method parameters on the local variables section of a Java stack.

Note that Figure 5-9 shows that the first parameter in the local variables for runInstanceMethod() is of type reference, even though no such parameter appears in the source code. This is the hidden this reference passed to every instance method. Instance methods use this reference to access the instance data of the object upon which they were invoked. As you can see by looking at the local variables for runClassMethod() in Figure 5-9, class methods do not receive a hidden this. Class methods are not invoked on objects. You can't directly access a class's instance variables from a class method, because there is no instance associated with the method invocation.

Note also that types byte, short, char, and boolean in the source code become ints in the local variables. This is also true of the operand stack. As mentioned earlier, the boolean type is not supported directly by the Java virtual machine. The Java compiler always uses ints to represent boolean values in the local variables or operand stack. Data types byte, short, and char, however, are supported directly by the Java virtual machine. These can be stored on the heap as instance variables or array elements, or in the method area as class variables. When placed into local variables or the operand stack, however, values of type byte, short, and char are converted into ints. They are manipulated as ints while on the stack frame, then converted back into byte, short, or char when stored back into heap or method area.

Also note that Object o is passed as a reference to runClassMethod(). In Java, all objects are passed by reference. As all objects are stored on the heap, you will never find an image of an object in the local variables or operand stack, only object references.

Aside from a method's parameters, which compilers must place into the local variables array first and in order of declaration, Java compilers can arrange the local variables array as they wish. Compilers can place the method's local variables into the array in any order, and they can use the same array entry for more than one local variable. For example, if two local variables have limited scopes that don't overlap, such as the i and j local variables in Example3b, compilers are free to use the same array entry for both variables. During the first half of the method, before j comes into scope, entry zero could be used for i. During the second half of the method, after i has gone out of scope, entry zero could be used for j.

// On CD-ROM in file jvm/ex3/Example3b.java
class Example3b {

    public static void runtwoLoops() {

        for (int i = 0; i < 10; ++i) {
            System.out.println(i);
        }

        for (int j = 9; j >= 0; --j) {
            System.out.println(j);
        }
    }
}

As with all the other runtime memory areas, implementation designers can use whatever data structures they deem most appropriate to represent the local variables. The Java virtual machine specification does not indicate how longs and doubles should be split across the two array entries they occupy. Implementations that use a word size of 64 bits could, for example, store the entire long or double in the lower of the two consecutive entries, leaving the higher entry unused.

Operand Stack

Like the local variables, the operand stack is organized as an array of words. But unlike the local variables, which are accessed via array indices, the operand stack is accessed by pushing and popping values. If an instruction pushes a value onto the operand stack, a later instruction can pop and use that value.

The virtual machine stores the same data types in the operand stack that it stores in the local variables: int, long, float, double, reference, and returnType. It converts values of type byte, short, and char to int before pushing them onto the operand stack.

Other than the program counter, which can't be directly accessed by instructions, the Java virtual machine has no registers. The Java virtual machine is stack-based rather than register-based because its instructions take their operands from the operand stack rather than from registers. Instructions can also take operands from other places, such as immediately following the opcode (the byte representing the instruction) in the bytecode stream, or from the constant pool. The Java virtual machine instruction set's main focus of attention, however, is the operand stack.

The Java virtual machine uses the operand stack as a work space. Many instructions pop values from the operand stack, operate on them, and push the result. For example, the iadd instruction adds two integers by popping two ints off the top of the operand stack, adding them, and pushing the int result. Here is how a Java virtual machine would add two local variables that contain ints and store the int result in a third local variable:

iload_0    // push the int in local variable 0
iload_1    // push the int in local variable 1
iadd       // pop two ints, add them, push result
istore_2   // pop int, store into local variable 2

In this sequence of bytecodes, the first two instructions, iload_0 and iload_1, push the ints stored in local variable positions zero and one onto the operand stack. The iadd instruction pops those two int values, adds them, and pushes the int result back onto the operand stack. The fourth instruction, istore_2, pops the result of the add off the top of the operand stack and stores it into local variable position two. In Figure 5-10, you can see a graphical depiction of the state of the local variables and operand stack while executing these instructions. In this figure, unused slots of the local variables and operand stack are left blank.

Figure 5-10. Adding two local variables.

Frame Data

In addition to the local variables and operand stack, the Java stack frame includes data to support constant pool resolution, normal method return, and exception dispatch. This data is stored in the frame data portion of the Java stack frame.

Many instructions in the Java virtual machine's instruction set refer to entries in the constant pool. Some instructions merely push constant values of type int, long, float, double, or String from the constant pool onto the operand stack. Some instructions use constant pool entries to refer to classes or arrays to instantiate, fields to access, or methods to invoke. Other instructions determine whether a particular object is a descendant of a particular class or interface specified by a constant pool entry.

Whenever the Java virtual machine encounters any of the instructions that refer to an entry in the constant pool, it uses the frame data's pointer to the constant pool to access that information. As mentioned earlier, references to types, fields, and methods in the constant pool are initially symbolic. When the virtual machine looks up a constant pool entry that refers to a class, interface, field, or method, that reference may still be symbolic. If so, the virtual machine must resolve the reference at that time.

Aside from constant pool resolution, the frame data must assist the virtual machine in processing a normal or abrupt method completion. If a method completes normally (by returning), the virtual machine must restore the stack frame of the invoking method. It must set the pc register to point to the instruction in the invoking method that follows the instruction that invoked the completing method. If the completing method returns a value, the virtual machine must push that value onto the operand stack of the invoking method.

The frame data must also contain some kind of reference to the method's exception table, which the virtual machine uses to process any exceptions thrown during the course of execution of the method. An exception table, which is described in detail in Chapter 17, "Exceptions," defines ranges within the bytecodes of a method that are protected by catch clauses. Each entry in an exception table gives a starting and ending position of the range protected by a catch clause, an index into the constant pool that gives the exception class being caught, and a starting position of the catch clause's code.

When a method throws an exception, the Java virtual machine uses the exception table referred to by the frame data to determine how to handle the exception. If the virtual machine finds a matching catch clause in the method's exception table, it transfers control to the beginning of that catch clause. If the virtual machine doesn't find a matching catch clause, the method completes abruptly. The virtual machine uses the information in the frame data to restore the invoking method's frame. It then rethrows the same exception in the context of the invoking method.

In addition to data to support constant pool resolution, normal method return, and exception dispatch, the stack frame may also include other information that is implementation dependent, such as data to support debugging.

Possible Implementations of the Java Stack

Implementation designers can represent the Java stack in whatever way they wish. As mentioned earlier, one potential way to implement the stack is by allocating each frame separately from a heap. As an example of this approach, consider the following class:

// On CD-ROM in file jvm/ex3/Example3c.java
class Example3c {

    public static void addAndPrint() {
        double result = addTwoTypes(1, 88.88);
        System.out.println(result);
    }

    public static double addTwoTypes(int i, double d) {
        return i + d;
    }
}

Figure 5-11 shows three snapshots of the Java stack for a thread that invokes the addAndPrint() method. In the implementation of the Java virtual machine represented in this figure, each frame is allocated separately from a heap. To invoke the addTwoTypes() method, the addAndPrint() method first pushes an int one and double 88.88 onto its operand stack. It then invokes the addTwoTypes() method.

Figure 5-11. Allocating frames from a heap.

The instruction to invoke addTwoTypes() refers to a constant pool entry. The Java virtual machine looks up the entry and resolves it if necessary.

Note that the addAndPrint() method uses the constant pool to identify the addTwoTypes() method, even though it is part of the same class. Like references to fields and methods of other classes, references to the fields and methods of the same class are initially symbolic and must be resolved before they are used.

The resolved constant pool entry points to information in the method area about the addTwoTypes() method. The virtual machine uses this information to determine the sizes required by addTwoTypes() for the local variables and operand stack. In the class file generated by Sun's javac compiler from the JDK 1.1, addTwoTypes() requires three words in the local variables and four words in the operand stack. (As mentioned earlier, the size of the frame data portion is implementation dependent.) The virtual machine allocates enough memory for the addTwoTypes() frame from a heap. It then pops the double and int parameters (88.88 and one) from addAndPrint()'s operand stack and places them into addTwoType()'s local variable slots one and zero.

When addTwoTypes() returns, it first pushes the double return value (in this case, 89.88) onto its operand stack. The virtual machine uses the information in the frame data to locate the stack frame of the invoking method, addAndPrint(). It pushes the double return value onto addAndPrint()'s operand stack and frees the memory occupied by addTwoType()'s frame. It makes addAndPrint()'s frame current and continues executing the addAndPrint() method at the first instruction past the addTwoType() method invocation.

Figure 5-12 shows snapshots of the Java stack of a different virtual machine implementation executing the same methods. Instead of allocating each frame separately from a heap, this implementation allocates frames from a contiguous stack. This approach allows the implementation to overlap the frames of adjacent methods. The portion of the invoking method's operand stack that contains the parameters to the invoked method become the base of the invoked method's local variables. In this example, addAndPrint()'s entire operand stack becomes addTwoType()'s entire local variables section.

Figure 5-12. Allocating frames from a contiguous stack.

This approach saves memory space because the same memory is used by the calling method to store the parameters as is used by the invoked method to access the parameters. It saves time because the Java virtual machine doesn't have to spend time copying the parameter values from one frame to another.

Note that the operand stack of the current frame is always at the "top" of the Java stack. Although this may be easier to visualize in the contiguous memory implementation of Figure 5-12, it is true no matter how the Java stack is implemented. (As mentioned earlier, in all the graphical images of the stack shown in this book, the stack grows downwards. The "top" of the stack is always shown at the bottom of the picture.) Instructions that push values onto (or pop values off of) the operand stack always operate on the current frame. Thus, pushing a value onto the operand stack can be seen as pushing a value onto the top of the entire Java stack. In the remainder of this book, "pushing a value onto the stack" refers to pushing a value onto the operand stack of the current frame.

One other possible approach to implementing the Java stack is a hybrid of the two approaches shown in Figure 5-11 and Figure 5-12. A Java virtual machine implementation can allocate a chunk of contiguous memory from a heap when a thread starts. In this memory, the virtual machine can use the overlapping frames approach shown in Figure 5-12. If the stack outgrows the contiguous memory, the virtual machine can allocate another chunk of contiguous memory from the heap. It can use the separate frames approach shown in Figure 5-11 to connect the invoking method's frame sitting in the old chunk with the invoked method's frame sitting in the new chunk. Within the new chunk, it can once again use the contiguous memory approach.

Native Method Stacks

In addition to all the runtime data areas defined by the Java virtual machine specification and described previously, a running Java application may use other data areas created by or for native methods. When a thread invokes a native method, it enters a new world in which the structures and security restrictions of the Java virtual machine no longer hamper its freedom. A native method can likely access the runtime data areas of the virtual machine (it depends upon the native method interface), but can also do anything else it wants. It may use registers inside the native processor, allocate memory on any number of native heaps, or use any kind of stack.

Native methods are inherently implementation dependent. Implementation designers are free to decide what mechanisms they will use to enable a Java application running on their implementation to invoke native methods.

Any native method interface will use some kind of native method stack. When a thread invokes a Java method, the virtual machine creates a new frame and pushes it onto the Java stack. When a thread invokes a native method, however, that thread leaves the Java stack behind. Instead of pushing a new frame onto the thread's Java stack, the Java virtual machine will simply dynamically link to and directly invoke the native method. One way to think of it is that the Java virtual machine is dynamically extending itself with native code. It is as if the Java virtual machine implementation is just calling another (dynamically linked) method within itself, at the behest of the running Java program.

If an implementation's native method interface uses a C-linkage model, then the native method stacks are C stacks. When a C program invokes a C function, the stack operates in a certain way. The arguments to the function are pushed onto the stack in a certain order. The return value is passed back to the invoking function in a certain way. This would be the behavior of the of native method stacks in that implementation.

A native method interface will likely (once again, it is up to the designers to decide) be able to call back into the Java virtual machine and invoke a Java method. In this case, the thread leaves the native method stack and enters another Java stack.

Figure 5-13 shows a graphical depiction of a thread that invokes a native method that calls back into the virtual machine to invoke another Java method. This figure shows the full picture of what a thread can expect inside the Java virtual machine. A thread may spend its entire lifetime executing Java methods, working with frames on its Java stack. Or, it may jump back and forth between the Java stack and native method stacks.

Figure 5-13. The stack for a thread that invokes Java and native methods.

As depicted in Figure 5-13, a thread first invoked two Java methods, the second of which invoked a native method. This act caused the virtual machine to use a native method stack. In this figure, the native method stack is shown as a finite amount of contiguous memory space. Assume it is a C stack. The stack area used by each C-linkage function is shown in gray and bounded by a dashed line. The first C-linkage function, which was invoked as a native method, invoked another C-linkage function. The second C-linkage function invoked a Java method through the native method interface. This Java method invoked another Java method, which is the current method shown in the figure.

As with the other runtime memory areas, the memory they occupied by native method stacks need not be of a fixed size. It can expand and contract as needed by the running application. Implementations may allow users or programmers to specify an initial size for the method area, as well as a maximum or minimum size.

Execution Engine

At the core of any Java virtual machine implementation is its execution engine. In the Java virtual machine specification, the behavior of the execution engine is defined in terms of an instruction set. For each instruction, the specification describes in detail what an implementation should do when it encounters the instruction as it executes bytecodes, but says very little about how. As mentioned in previous chapters, implementation designers are free to decide how their implementations will execute bytecodes. Their implementations can interpret, just-in-time compile, execute natively in silicon, use a combination of these, or dream up some brand new technique.

Similar to the three senses of the term "Java virtual machine" described at the beginning of this chapter, the term "execution engine" can also be used in any of three senses: an abstract specification, a concrete implementation, or a runtime instance. The abstract specification defines the behavior of an execution engine in terms of the instruction set. Concrete implementations, which may use a variety of techniques, are either software, hardware, or a combination of both. A runtime instance of an execution engine is a thread.

Each thread of a running Java application is a distinct instance of the virtual machine's execution engine. From the beginning of its lifetime to the end, a thread is either executing bytecodes or native methods. A thread may execute bytecodes directly, by interpreting or executing natively in silicon, or indirectly, by just- in-time compiling and executing the resulting native code. A Java virtual machine implementation may use other threads invisible to the running application, such as a thread that performs garbage collection. Such threads need not be "instances" of the implementation's execution engine. All threads that belong to the running application, however, are execution engines in action.

The Instruction Set

A method's bytecode stream is a sequence of instructions for the Java virtual machine. Each instruction consists of a one-byte opcode followed by zero or more operands. The opcode indicates the operation to be performed. Operands supply extra information needed by the Java virtual machine to perform the operation specified by the opcode. The opcode itself indicates whether or not it is followed by operands, and the form the operands (if any) take. Many Java virtual machine instructions take no operands, and therefore consist only of an opcode. Depending upon the opcode, the virtual machine may refer to data stored in other areas in addition to (or instead of) operands that trail the opcode. When it executes an instruction, the virtual machine may use entries in the current constant pool, entries in the current frame's local variables, or values sitting on the top of the current frame's operand stack.

The abstract execution engine runs by executing bytecodes one instruction at a time. This process takes place for each thread (execution engine instance) of the application running in the Java virtual machine. An execution engine fetches an opcode and, if that opcode has operands, fetches the operands. It executes the action requested by the opcode and its operands, then fetches another opcode. Execution of bytecodes continues until a thread completes either by returning from its starting method or by not catching a thrown exception.

From time to time, the execution engine may encounter an instruction that requests a native method invocation. On such occasions, the execution engine will dutifully attempt to invoke that native method. When the native method returns (if it completes normally, not by throwing an exception), the execution engine will continue executing the next instruction in the bytecode stream.

One way to think of native methods, therefore, is as programmer-customized extensions to the Java virtual machine's instruction set. If an instruction requests an invocation of a native method, the execution engine invokes the native method. Running the native method is how the Java virtual machine executes the instruction. When the native method returns, the virtual machine moves on to the next instruction. If the native method completes abruptly (by throwing an exception), the virtual machine follows the same steps to handle the exception as it does when any instruction throws an exception.

Part of the job of executing an instruction is determining the next instruction to execute. An execution engine determines the next opcode to fetch in one of three ways. For many instructions, the next opcode to execute directly follows the current opcode and its operands, if any, in the bytecode stream. For some instructions, such as goto or return, the execution engine determines the next opcode as part of its execution of the current instruction. If an instruction throws an exception, the execution engine determines the next opcode to fetch by searching for an appropriate catch clause.

Several instructions can throw exceptions. The athrow instruction, for example, throws an exception explicitly. This instruction is the compiled form of the throw statement in Java source code. Every time the athrow instruction is executed, it will throw an exception. Other instructions throw exceptions only when certain conditions are encountered. For example, if the Java virtual machine discovers, to its chagrin, that the program is attempting to perform an integer divide by zero, it will throw an ArithmeticException. This can occur while executing any of four instructions--idiv, ldiv, irem, and lrem--which perform divisions or calculate remainders on ints or longs.

Each type of opcode in the Java virtual machine's instruction set has a mnemonic. In the typical assembly language style, streams of Java bytecodes can be represented by their mnemonics followed by (optional) operand values.

For an example of method's bytecode stream and mnemonics, consider the doMathForever() method of this class:

// On CD-ROM in file jvm/ex4/Act.java
class Act {

    public static void doMathForever() {
        int i = 0;
        for (;;) {
            i += 1;
            i *= 2;
        }
    }
}

The stream of bytecodes for doMathForever() can be disassembled into mnemonics as shown next. The Java virtual machine specification does not define any official syntax for representing the mnemonics of a method's bytecodes. The code shown next illustrates the manner in which streams of bytecode mnemonics will be represented in this book. The left hand column shows the offset in bytes from the beginning of the method's bytecodes to the start of each instruction. The center column shows the instruction and any operands. The right hand column contains comments, which are preceded with a double slash, just as in Java source code.

// Bytecode stream: 03 3b 84 00 01 1a 05 68 3b a7 ff f9
// Disassembly:
// Method void doMathForever()
// Left column: offset of instruction from beginning of method
// |   Center column: instruction mnemonic and any operands
// |   |                   Right column: comment
   0   iconst_0           // 03
   1   istore_0           // 3b
   2   iinc 0, 1          // 84 00 01
   5   iload_0            // 1a
   6   iconst_2           // 05
   7   imul               // 68
   8   istore_0           // 3b
   9   goto 2             // a7 ff f9

This way of representing mnemonics is very similar to the output of the javap program of Sun's Java 2 SDK. javap allows you to look at the bytecode mnemonics of the methods of any class file. Note that jump addresses are given as offsets from the beginning of the method. The goto instruction causes the virtual machine to jump to the instruction at offset two (an iinc). The actual operand in the stream is minus seven. To execute this instruction, the virtual machine adds the operand to the current contents of the pc register. The result is the address of the iinc instruction at offset two. To make the mnemonics easier to read, the operands for jump instructions are shown as if the addition has already taken place. Instead of saying "goto -7," the mnemonics say, "goto 2."

The central focus of the Java virtual machine's instruction set is the operand stack. Values are generally pushed onto the operand stack before they are used. Although the Java virtual machine has no registers for storing arbitrary values, each method has a set of local variables. The instruction set treats the local variables, in effect, as a set of registers that are referred to by indexes. Nevertheless, other than the iinc instruction, which increments a local variable directly, values stored in the local variables must be moved to the operand stack before being used.

For example, to divide one local variable by another, the virtual machine must push both onto the stack, perform the division, and then store the result back into the local variables. To move the value of an array element or object field into a local variable, the virtual machine must first push the value onto the stack, then store it into the local variable. To set an array element or object field to a value stored in a local variable, the virtual machine must follow the reverse procedure. First, it must push the value of the local variable onto the stack, then pop it off the stack and into the array element or object field on the heap.

Several goals--some conflicting--guided the design of the Java virtual machine's instruction set. These goals are basically the same as those described in Part I of this book as the motivation behind Java's entire architecture: platform independence, network mobility, and security.

The platform independence goal was a major influence in the design of the instruction set. The instruction set's stack-centered approach, described previously, was chosen over a register-centered approach to facilitate efficient implementation on architectures with few or irregular registers, such as the Intel 80X86. This feature of the instruction set--the stack-centered design--make it easier to implement the Java virtual machine on a wide variety of host architectures.

Another motivation for Java's stack-centered instruction set is that compilers usually use a stack-based architecture to pass an intermediate compiled form or the compiled program to a linker/optimizer. The Java class file, which is in many ways similar to the UNIX .o or Windows .obj file emitted by a C compiler, really represents an intermediate compiled form of a Java program. In the case of Java, the virtual machine serves as (dynamic) linker and may serve as optimizer. The stack-centered architecture of the Java virtual machine's instruction set facilitates the optimization that may be performed at run-time in conjunction with execution engines that perform just-in-time compiling or adaptive optimization.

As mentioned in Chapter 4, "Network Mobility," one major design consideration was class file compactness. Compactness is important because it facilitates speedy transmission of class files across networks. In the bytecodes stored in class files, all instructions--except two that deal with table jumping--are aligned on byte boundaries. The total number of opcodes is small enough so that opcodes occupy only one byte. This design strategy favors class file compactness possibly at the cost of some performance when the program runs. In some Java virtual machine implementations, especially those executing bytecodes in silicon, the single-byte opcode may preclude certain optimizations that could improve performance. Also, better performance may have been possible on some implementations if the bytecode streams were word-aligned instead of byte-aligned. (An implementation could always realign bytecode streams, or translate opcodes into a more efficient form as classes are loaded. Bytecodes are byte-aligned in the class file and in the specification of the abstract method area and execution engine. Concrete implementations can store the loaded bytecode streams any way they wish.)

Another goal that guided the design of the instruction set was the ability to do bytecode verification, especially all at once by a data flow analyzer. The verification capability is needed as part of Java's security framework. The ability to use a data flow analyzer on the bytecodes when they are loaded, rather than verifying each instruction as it is executed, facilitates execution speed. One way this design goal manifests itself in the instruction set is that most opcodes indicate the type they operate on.

For example, instead of simply having one instruction that pops a word from the operand stack and stores it in a local variable, the Java virtual machine's instruction set has two. One instruction, istore, pops and stores an int. The other instruction, fstore, pops and stores a float. Both of these instructions perform the exact same function when executed: they pop a word and store it. Distinguishing between popping and storing an int versus a float is important only to the verification process.

For many instructions, the virtual machine needs to know the types being operated on to know how to perform the operation. For example, the Java virtual machine supports two ways of adding two words together, yielding a one-word result. One addition treats the words as ints, the other as floats. The difference between these two instructions facilitates verification, but also tells the virtual machine whether it should perform integer or floating point arithmetic.

A few instructions operate on any type. The dup instruction, for example, duplicates the top word of a stack irrespective of its type. Some instructions, such as goto, don't operate on typed values. The majority of the instructions, however, operate on a specific type. The mnemonics for most of these "typed" instructions indicate their type by a single character prefix that starts their mnemonic. Table 5-2 shows the prefixes for the various types. A few instructions, such as arraylength or instanceof, don't include a prefix because their type is obvious. The arraylength opcode requires an array reference. The instanceof opcode requires an object reference.

Type	Code	Example	Description
`byte`	`b`	`baload`	load `byte` from array
`short`	`s`	`saload`	load `short` from array
`int`	`i`	`iaload`	load `int` from array
`long`	`l`	`laload`	load `long` from array
`char`	`c`	`caload`	load `char` from array
`float`	`f`	`faload`	load `float` from array
`double`	`d`	`daload`	load `double` from array
`reference`	`a`	`aaload`	load `reference` from array

Table 5-2. Type prefixes of bytecode mnemonics

Values on the operand stack must be used in a manner appropriate to their type. It is illegal, for example, to push four ints, then add them as if they were two longs. It is illegal to push a float value onto the operand stack from the local variables, then store it as an int in an array on the heap. It is illegal to push a double value from an object field on the heap, then store the topmost of its two words into the local variables as an value of type reference. The strict type rules that are enforced by Java compilers must also be enforced by Java virtual machine implementations.

Implementations must also observe rules when executing instructions that perform generic stack operations independent of type. As mentioned previously, the dup instruction pushes a copy of the top word of the stack, irrespective of type. This instruction can be used on any value that occupies one word: an int, float, reference, or returnAddress. It is illegal, however, to use dup when the top of the stack contains either a long or double, the data types that occupy two consecutive operand stack locations. A long or double sitting on the top of the operand stack can be duplicated in their entirety by the dup2 instruction, which pushes a copy of the top two words onto the operand stack. The generic instructions cannot be used to split up dual-word values.

To keep the instruction set small enough to enable each opcode to be represented by a single byte, not all operations are supported on all types. Most operations are not supported for types byte, short, and char. These types are converted to int when moved from the heap or method area to the stack frame. They are operated on as ints, then converted back to byte, short, or char before being stored back into the heap or method area.

Table 5-3 shows the computation types that correspond to each storage type in the Java virtual machine. As used here, a storage type is the manner in which values of the type are represented on the heap. The storage type corresponds to the type of the variable in Java source code. A computation type is the manner in which the type is represented on the Java stack frame.

Storage Type	Minimum Bits in Heap or Method Area	Computation Type	Words in the Java Stack Frame
`byte`	8	`int`	1
`short`	16	`int`	1
`int`	32	`int`	1
`long`	64	`long`	2
`char`	16	`int`	1
`float`	32	`float`	1
`double`	64	`double`	2
`reference`	32	`reference`	1

Table 5-3. Storage and computation types inside the Java virtual machine

Implementations of the Java virtual machine must in some way ensure that values are operated on by instructions appropriate to their type. They can verify bytecodes up front as part of the class verification process, on the fly as the program executes, or some combination of both. Bytecode verification is described in more detail in Chapter 7, "The Lifetime of a Type." The entire instruction set is covered in detail in Chapters 10 through 20

Execution Techniques

Various execution techniques that may be used by an implementation--interpreting, just-in-time compiling, adaptive optimization, native execution in silicon--were described in Chapter 1, "Introduction to Java's Architecture." The main point to remember about execution techniques is that an implementation can use any technique to execute bytecodes so long as it adheres to the semantics of the Java virtual machine instruction set.

One of the most interesting -- and speedy -- execution techniques is adaptive optimization. The adaptive optimization technique, which is used by several existing Java virtual machine implementations, including Sun's Hotspot virtual machine, borrows from techniques used by earlier virtual machine implementations. The original JVMs interpreted bytecodes one at a time. Second-generation JVMs added a JIT compiler, which compiles each method to native code upon first execution, then executes the native code. Thereafter, whenever the method is called, the native code is executed. Adaptive optimizers, taking advantage of information available only at run-time, attempt to combine bytecode interpretation and compilation to native in the way that will yield optimum performance.

An adaptive optimizing virtual machine begins by interpreting all code, but it monitors the execution of that code. Most programs spend 80 to 90 percent of their time executing 10 to 20 percent of the code. By monitoring the program execution, the virtual machine can figure out which methods represent the program's "hot spot" -- the 10 to 20 percent of the code that is executed 80 to 90 percent of the time.

When the adaptive optimizing virtual machine decides that a particular method is in the hot spot, it fires off a background thread that compiles those bytecodes to native and heavily optimizes the native code. Meanwhile, the program can still execute that method by interpreting its bytecodes. Because the program isn't held up and because the virtual machine is only compiling and optimizing the "hot spot" (perhaps 10 to 20 percent of the code), the virtual machine has more time than a traditional JIT to perform optimizations.

The adaptive optimization approach yields a program in which the code that is executed 80 to 90 percent of the time is native code as heavily optimized as statically compiled C++, with a memory footprint not much bigger than a fully interpreted Java program. In other words, fast. An adaptive optimizing virtual machine can keep the old bytecodes around in case a method moves out of the hot spot. (The hot spot may move somewhat as the program executes.) If a method moves out of the hot spot, the virtual machine can discard the compiled code and revert back to interpreting that method's bytecodes.

As you may have noticed, an adaptive optimizer's approach to making Java programs run fast is similar to the approach programmers should take to improve a program's performance. An adaptive optimizing virtual machine, unlike a regular JIT compiling virtual machine, doesn't do "premature optimization." The adaptive optimizing virtual machine begins by interpreting bytecodes. As the program runs, the virtual machine "profiles" the program to find the program's "hot spot," that 10 to 20 percent of the code that gets executed 80 to 90 percent of the time. And like a good programmer, the adaptive optimizing virtual machine just focuses its optimization efforts on that time-critical code.

But there is a bit more to the adaptive optimization story. Adaptive optimizers can be tuned for the run- time characteristics of Java programs -- in particular, of "well- designed" Java programs. According to David Griswold, Hotspot manager at JavaSoft, "Java is a lot more object-oriented than C++. You can measure that; you can look at the rates of method invocations, dynamic dispatches, and such things. And the rates [for Java] are much higher than they are in C++." Now this high rate of method invocations and dynamic dispatches is especially true in a well-designed Java program, because one aspect of a well-designed Java program is highly factored, fine-grained design -- in other words, lots of compact, cohesive methods and compact, cohesive objects.

This run-time characteristic of Java programs, the high frequency of method invocations and dynamic dispatches, affects performance in two ways. First, there is an overhead associated with each dynamic dispatch. Second, and more significantly, method invocations reduce the effectiveness of compiler optimization.

Method invocations reduce the effectiveness of optimizers because optimizers don't perform well across method invocation boundaries. As a result, optimizers end up focusing on the code between method invocations. And the greater the method invocation frequency, the less code the optimizer has to work with between method invocations, and the less effective the optimization becomes.

The standard solution to this problem is inlining -- the copying of an invoked method's body directly into the body of the invoking method. Inlining eliminates method calls and gives the optimizer more code to work with. It makes possible more effective optimization at the cost of increasing the run- time memory footprint of the program.

The trouble is that inlining is harder with object-oriented languages, such as Java and C++, than with non-object-oriented languages, such as C, because object-oriented languages use dynamic dispatching. And the problem is worse in Java than in C++, because Java has a greater call frequency and a greater percentage of dynamic dispatches than C++.

A regular optimizing static compiler for a C program can inline straightforwardly because there is one function implementation for each function call. The trouble with doing inlining with object- oriented languages is that dynamic method dispatch means there may be multiple function (or method) implementation for any given function call. In other words, the JVM may have many different implementations of a method to choose from at run time, based on the class of the object on which the method is being invoked.

One solution to the problem of inlining a dynamically dispatched method call is to just inline all of the method implementations that may get selected at run-time. The trouble with this solution is that in cases where there are a lot of method implementations, the size of the optimized code can grow very large.

One advantage adaptive optimization has over static compilation is that, because it is happening at runtime, it can use information not available to a static compiler. For example, even though there may be 30 possible implementations that may get called for a particular method invocation, at run-time perhaps only two of them are ever called. The adaptive optimization approach enables only those two to be inlined, thereby minimizing the size of the optimized code.

Threads

The Java virtual machine specification defines a threading model that aims to facilitate implementation on a wide variety of architectures. One goal of the Java threading model is to enable implementation designers, where possible and appropriate, to use native threads. Alternatively, designers can implement a thread mechanism as part of their virtual machine implementation. One advantage to using native threads on a multi-processor host is that different threads of a Java application could run simultaneously on different processors.

One tradeoff of Java's threading model is that the specification of priorities is lowest-common- denominator. A Java thread can run at any one of ten priorities. Priority one is the lowest, and priority ten is the highest. If designers use native threads, they can map the ten Java priorities onto the native priorities however seems most appropriate. The Java virtual machine specification defines the behavior of threads at different priorities only by saying that all threads at the highest priority will get some CPU time. Threads at lower priorities are guaranteed to get CPU time only when all higher priority threads are blocked. Lower priority threads may get some CPU time when higher priority threads aren't blocked, but there are no guarantees.

The specification doesn't assume time-slicing between threads of different priorities, because not all architectures time-slice. (As used here, time-slicing means that all threads at all priorities will be guaranteed some CPU time, even when no threads are blocked.) Even among those architectures that do time-slice, the algorithms used to allot time slots to threads at various priorities can differ greatly.

As mentioned in Chapter 2, "Platform Independence," you must not rely on time-slicing for program correctness. You should use thread priorities only to give the Java virtual machine hints at what it should spend more time on. To coordinate the activities of multiple threads, you should use synchronization.

The thread implementation of any Java virtual machine must support two aspects of synchronization: object locking and thread wait and notify. Object locking helps keep threads from interfering with one another while working independently on shared data. Thread wait and notify helps threads to cooperate with one another while working together toward some common goal. Running applications access the Java virtual machine's locking capabilities via the instruction set, and its wait and notify capabilities via the wait(), notify(), and notifyAll() methods of class Object. For more details, see Chapter 20, "Thread Synchronization."

In the Java virtual machine Specification, the behavior of Java threads is defined in terms of variables, a main memory, and working memories. Each Java virtual machine instance has a main memory, which contains all the program's variables: instance variables of objects, components of arrays, and class variables. Each thread has a working memory, in which the thread stores "working copies" of variables it uses or assigns. Local variables and parameters, because they are private to individual threads, can be logically seen as part of either the working memory or main memory.

The Java virtual machine specification defines many rules that govern the low-level interactions of threads with main memory. For example, one rule states that all operations on primitive types, except in some cases longs and doubles, are atomic. For example, if two threads compete to write two different values to an int variable, even in the absence of synchronization, the variable will end up with one value or the other. The variable will not contain a corrupted value. In other words, one thread will win the competition and write its value to the variable first. The losing thread need not sulk, however, because it will write its value the variable second, overwriting the "winning" thread's value.

The exception to this rule is any long or double variable that is not declared volatile. Rather than being treated as a single atomic 64-bit value, such variables may be treated by some implementations as two atomic 32-bit values. Storing a non-volatile long to memory, for example, could involve two 32-bit write operations. This non- atomic treatment of longs and doubles means that two threads competing to write two different values to a long or double variable can legally yield a corrupted result.

Although implementation designers are not required to treat operations involving non-volatile longs and doubles atomically, the Java virtual machine specification encourages them to do so anyway. This non-atomic treatment of longs and doubles is an exception to the general rule that operations on primitive types are atomic. This exception is intended to facilitate efficient implementation of the threading model on processors that don't provide efficient ways to transfer 64-bit values to and from memory. In the future, this exception may be eliminated. For the time being, however, Java programmers must be sure to synchronize access to shared longs and doubles.

Fundamentally, the rules governing low-level thread behavior specify when a thread may and when it must:

copy values of variables from the main memory to its working memory, and
write values from its working memory back into the main memory.

For certain conditions, the rules specify a precise and predictable order of memory reads and writes. For other conditions, however, the rules do not specify any order. The rules are designed to enable Java programmers to build multi-threaded programs that exhibit predictable behavior, while giving implementation designers some flexibility. This flexibility enables designers of Java virtual machine implementations to take advantage of standard hardware and software techniques that can improve the performance of multi-threaded applications.

The fundamental high-level implication of all the low-level rules that govern the behavior of threads is this: If access to certain variables isn't synchronized, threads are allowed update those variables in main memory in any order. Without synchronization, your multi-threaded applications may exhibit surprising behavior on some Java virtual machine implementations. With proper use of synchronization, however, you can create multi-threaded Java applications that behave in a predictable way on any implementation of the Java virtual machine.

Native Method Interface

Java virtual machine implementations aren't required to support any particular native method interface. Some implementations may support no native method interfaces at all. Others may support several, each geared towards a different purpose.

Sun's Java Native Interface, or JNI, is geared towards portability. JNI is designed so it can be supported by any implementation of the Java virtual machine, no matter what garbage collection technique or object representation the implementation uses. This in turn enables developers to link the same (JNI compatible) native method binaries to any JNI-supporting virtual machine implementation on a particular host platform.

Implementation designers can choose to create proprietary native method interfaces in addition to, or instead of, JNI. To achieve its portability, the JNI uses a lot of indirection through pointers to pointers and pointers to functions. To obtain the ultimate in performance, designers of an implementation may decide to offer their own low-level native method interface that is tied closely to the structure of their particular implementation. Designers could also decide to offer a higher-level native method interface than JNI, such as one that brings Java objects into a component software model.

To do useful work, a native method must be able to interact to some degree with the internal state of the Java virtual machine instance. For example, a native method interface may allow native methods to do some or all of the following:

Pass and return data
Access instance variables or invoke methods in objects on the garbage-collected heap
Access class variables or invoke class methods
Accessing arrays
Lock an object on the heap for exclusive use by the current thread
Create new objects on the garbage-collected heap
Load new classes
Throw new exceptions
Catch exceptions thrown by Java methods that the native method invoked
Catch asynchronous exceptions thrown by the virtual machine
Indicate to the garbage collector that it no longer needs to use a particular object

Designing a native method interface that offers these services can be complicated. The design needs to ensure that the garbage collector doesn't free any objects that are being used by native methods. If an implementation's garbage collector moves objects to keep heap fragmentation at a minimum, the native method interface design must make sure that either:

an object can be moved after its reference has been passed to a native method, or
any objects whose references have been passed to a native method are pinned until the native method returns or otherwise indicates it is done with the objects As you can see, native method interfaces are very intertwined with the inner workings of a Java virtual machine.

The Real Machine

As mentioned at the beginning of this chapter, all the subsystems, runtime data areas, and internal behaviors defined by the Java virtual machine specification are abstract. Designers aren't required to organize their implementations around "real" components that map closely to the abstract components of the specification. The abstract internal components and behaviors are merely a vocabulary with which the specification defines the required external behavior of any Java virtual machine implementation.

In other words, an implementation can be anything on the inside, so long as it behaves like a Java virtual machine on the outside. Implementations must be able to recognize Java class files and must adhere to the semantics of the Java code the class files contain. But otherwise, anything goes. How bytecodes are executed, how the runtime data areas are organized, how garbage collection is accomplished, how threads are implemented, how the bootstrap class loader finds classes, what native method interfaces are supported-- these are some of the many decisions left to implementation designers.

The flexibility of the specification gives designers the freedom to tailor their implementations to fit their circumstances. In some implementations, minimizing usage of resources may be critical. In other implementations, where resources are plentiful, maximizing performance may be the one and only goal.

By clearly marking the line between the external behavior and the internal implementation of a Java virtual machine, the specification preserves compatibility among all implementations while promoting innovation. Designers are encouraged to apply their talents and creativity towards building ever-better Java virtual machines.

Eternal Math: A Simulation

The CD-ROM contains several simulation applets that serve as interactive illustrations for the material presented in this book. The applet shown in Figure 5-14 simulates a Java virtual machine executing a few bytecodes. You can run this applet by loading applets/EternalMath.html from the CD-ROM into any Java enabled web browser or applet viewer that supports JDK 1.0.

The instructions in the simulation represent the body of the doMathForever() method of class Act, shown previously in the "Instruction Set" section of this chapter. This simulation shows the local variables and operand stack of the current frame, the pc register, and the bytecodes in the method area. It also shows an optop register, which you can think of as part of the frame data of this particular implementation of the Java virtual machine. The optop register always points to one word beyond the top of the operand stack.

The applet has four buttons: Step, Reset, Run, and Stop. Each time you press the Step button, the Java virtual machine simulator will execute the instruction pointed to by the pc register. Initially, the pc register points to an iconst_0 instruction. The first time you press the Step button, therefore, the virtual machine will execute iconst_0. It will push a zero onto the stack and set the pc register to point to the next instruction to execute. Subsequent presses of the Step button will execute subsequent instructions and the pc register will lead the way. If you press the Run button, the simulation will continue with no further coaxing on your part until you press the Stop button. To start the simulation over, press the Reset button.

The value of each register (pc and optop) is shown two ways. The contents of each register, an integer offset from the beginning of either the method's bytecodes or the operand stack, is shown in an edit box. Also, a small arrow (either "pc>" or "optop>") indicates the location contained in the register.

In the simulation the operand stack is shown growing down the panel (up in memory offsets) as words are pushed onto it. The top of the stack recedes back up the panel as words are popped from it.

The doMathForever() method has only one local variable, i, which sits at array position zero. The first two instructions, iconst_0 and istore_0 initialize the local variable to zero. The next instruction, iinc, increments i by one. This instruction implements the i += 1 statement from doMathForever(). The next instruction, iload_0, pushes the value of the local variable onto the operand stack. iconst_2 pushes an int 2 onto the operand stack. imul pops the top two ints from the operand stack, multiplies them, and pushes the result. The istore_0 instruction pops the result of the multiply and puts it into the local variable. The previous four instructions implement the i *= 2 statement from doMathForever(). The last instruction, goto, sends the program counter back to the iinc instruction. The goto implements the for (;;) loop of doMathForever().

With enough patience and clicks of the Step button (or a long enough run of the Run button), you can get an arithmetic overflow. When the Java virtual machine encounters such a condition, it just truncates, as is shown by this simulation. It does not throw any exceptions.

For each step of the simulation, a panel at the bottom of the applet contains an explanation of what the next instruction will do. Happy clicking.

Figure 5-14. The Eternal Math applet.

On the CD-ROM

The CD-ROM contains the source code examples from this chapter in the jvm directory. The Eternal Math applet is contained in a web page on the CD-ROM in file applets/EternalMath.html. The source code for this applet is found alongside its class files, in the applets/JVMSimulators and applets/JVMSimulators/COM/artima/jvmsim directories.

The Resources Page

For links to more information about the Java virtual machine, visit the resources page: http://www.artima.com/insidejvm/resources/

이 글에는 트랙백을 보낼 수 없습니다

Application_developing 2007/07/22 15:01

UML의 기초: Unified Modeling Language 소개

20세기 말을 되돌아보며 --

정확히 1997년 Object Management Group (OMG)이 Unified Modeling Language (UML)을 발표했다. UML의 목표 중 하나는 개발 커뮤니티에 안정적이고, 일반적인 디자인 언어를 제공하는 것이었다. UML은 IT 전문가들이 수년 동안 바라던 통합된 표준 모델링 표기법을 탄생시켰다. UML을 사용하여 IT 전문가들은 시스템 구조와 디자인 계획을 읽고 분산시킬 수 있다. 건축가들이 빌딩의 청사진을 사용하는 것처럼 말이다.

이제 21세기가 되었고 UML은 전문성을 확립하게 되었다. 내가 보고 있는 이력서의 75 퍼센트 정도가 UML을 알고 있다고 쓰여있다. 하지만 면접을 통해 이야기를 해보면 이들이 진정으로 UML을 알지 못하고 있다는 것이 명확해진다. 일반적으로 당시 이슈가 되는 키워드 로서 알고 있거나 표면적인 면만 알고 있는 경우가 대부분이었다. 이것이 바로 내가 이 글을 쓴 이유이다. 이 글을 다 읽었다고 해서 이력서에 UML을 충분히 알고 있다고 쓸 수는 없겠지만, 이 언어를 보다 심도 깊게 연구할 출발선에는 설 정도는 된 것이다.

배경 지식

UML은 컴퓨터 애플리케이션을 모델링 할 수 있는 통합 언어이다. 주요 작성자들은 Jim Rumbaugh, Ivar Jacobson, Grady Booch이고 이들은 원래 그들만의 꽤 괜찮은 방식(OMT, OOSE, Booch)을 갖고 있었다. 결국 이들은 힘을 합쳤고 개방형 표준을 만들었다. (어디서 많이 들어본 소리인가? J2EE, SOAP, Linux도 비슷한 현상이다.) UML이 표준 모델링 언어가 된 한 가지 이유는 이것이 프로그래밍 언어에 독립적이라는데 있다. (IBM Rational의 UML 모델링 툴은 .NET 뿐만 아니라 J2EE에서도 사용된다.) 또한 UML 표기법 세트는 언어이지 방법론이 아니다. 언어인 것이 중요한 이유는 방법론과는 반대로 언어는 기업이 비즈니스를 수행하는 방식에 잘 맞는다.

UML은 방법론이 아니기 때문에 (IBM Rational Unified Process® lingo의 "객체(artifacts)" 같은) 어떤 형식적인 작업 생성물들이 필요 없다. 하지만 정해진 방법론 안에서 쓰이면, 애플리케이션을 개발할 때 애플리케이션을 쉽게 이해할 수 있도록 도와주는 여러 가지 유형의 다이어그램을 제공한다. 이 다이어그램은 현재 사용하고 있는 것의 언어와 원리를 잘 소개하고 있다. 사용중인 방법론에서 생긴 작업 생산품들에 표준 UML 다이어그램을 배치하여 UML에 능숙한 사람들이 프로젝트에 쉽게 참여하여 생산성을 높일 수 있도록 한다. 가장 유용한 표준 UML 다이어그램은 사용 케이스 다이어그램, 클래스 다이어그램, 시퀀스 다이어그램, 스테이트 차트 다이어그램, 액티비티 다이어그램, 컴포넌트 다이어그램, 전개 다이어그램 등이 있다.

각 유형의 다이어그램을 자세히 설명하지는 않겠다. 대신, 각 다이어그램에 대한 일반적인 개념을 소개하고 자세한 것은 나중에 다루도록 하겠다.

위로

사용 케이스 다이어그램

사용 케이스는 시스템에서 제공한 기능 단위를 설명한다. 사용 케이스 다이어그램의 주요 목적은, 다른 사용 케이스들 간 관계 뿐만 아니라 주요 프로세스에 대한 "액터(actors)" (시스템과 인터랙팅하는 사람)들과의 관계를 포함하여, 개발 팀들이 시스템의 기능적 요구 사항들을 시각화 하는 데 있다. 사용 케이스 다이어그램은 사용 케이스 그룹들을 보여준다. 완전한 시스템에 대한 모든 사용 케이스이거나 관련된 기능을 가진 특정 사용 케이스 그룹(예를 들어, 보안 관리에 관련된 사용 케이스 그룹)의 사용 케이스일 수도 있다. 사용 케이스 다이어그램에 대한 사용 케이스를 보여주려면 다이어그램 중간에 타원을 그려서, 타원의 중앙 또는 아래에 사용 케이스 이름을 적어놓는다. 사용 케이스 다이어그램에 액터(시스템 사용자)를 그리려면 다이어그램의 왼쪽이나 오른쪽에 사람 모양을 그려 넣는다. (얼마나 예쁘게 그리는가는 여러분에게 달려있다.) 액터와 사용 케이스들간 관계는 그림 1에 나타나있다.

그림 1: 사용 케이스 다이어그램

사용 케이스 다이어그램은 시스템의 고급 기능과 시스템의 범위를 설명하는데 사용된다. 그림 1의 사용 케이스 다이어그램을 통해, 시스템이 제공하는 기능을 쉽게 표현할 수 있다. 이러한 시스템에서는 밴드 매니저가 밴드가 발매한 CD에 대한 판매 통계 리포트와 Billboard 200 보고서를 볼 수 있다. 또한 레코드 매니저는 특정 CD에 대한 판매 통계 보고서와 Billboard 200 보고서를 볼 수 있다. 이 다이어그램에서는 Billboard Reporting Service라고 하는 외부 시스템에서 우리 시스템이 Billboard 리포트를 전달하고 있다는 것도 볼 수 있다.

또한, 이 다이어그램에 사용 케이스가 없다는 것은 시스템이 수행하지 않은 일을 보여주고 있는 것이다. 예를 들어, 이 다이어그램은 밴드 매니저가 Billboard 200의 다른 앨범들에 수록된 노래를 듣는 방식은 나와있지 않다. Billboard 200에서 Listen to Songs 라는 사용 케이스에 대한 어떤 레퍼런스도 볼 수 없다. 이것은 중요한 포인트이다. 그와 같은 다이어그램에 제공된 명확하고 간단한 사용 케이스 설명을 통해 프로젝트 스폰서는 시스템에 필요한 기능이 존재하는지 여부를 쉽게 볼 수 있는 것이다.

위로

클래스 다이어그램

클래스 다이어그램은 다른 엔터티들(사람, 제품, 데이터)이 서로 어떻게 관계를 맺고 있는지를 보여준다. 다시 말해서, 이것은 시스템의 정적 구조라고 할 수 있다. 클래스 다이어그램은 록밴드, 씨디, 라디오 연주를 논리적 클래스로 나타내는데 사용될 수 있다. 또는 대출, 주택 저당 대출, 자동차 대출, 이자율을 나타내는데도 쓰일 수 있겠다. 클래스 다이어그램은 주로 프로그래머들이 다루는 구현 클래스들을 보여주는데 쓰인다. 구현 클래스 다이어그램은 논리적 클래스 다이어그램과 같은 클래스를 보여준다. 하지만 이 구현 클래스 다이어그램은 같은 애트리뷰트로는 그릴 수 없다. Vectors와 HashMaps 같은 것에 대한 레퍼런스를 갖고 있기 때문이다.

그림 2에서는 세 개의 섹션으로 클래스 다이어그램을 설명하고 있다. 위 섹션은 클래스의 이름을, 중간 섹션은 클래스의 애트리뷰트를, 가장 아래 섹션은 클래스의 연산(“그림 2에서는 세 개의 섹션으로 클래스 다이어그램을 설명하고 있다. 위 섹션은 클래스의 이름을, 중간 섹션은 클래스의 애트리뷰트를, 가장 아래 섹션은 클래스의 연산 ("메소드")을 보여주고 있다.

그림 2: 클래스 다이어그램의 클래스 객체

내 경험으로는 거의 모든 개발자들은 이 다이어그램이 무엇을 하고 있는지를 안다. 하지만 대부분의 프로그래머들은 관계도를 잘못 그리고 있다. 그림 3과 같은 클래스 다이어그램의 경우 상속 관계^{주 1}를 그릴 때에는 화살표 방향을 위로 향하게 하여 수퍼 클래스를 지시하게 하면서 화살표 모양은 완벽한 삼각형이 되도록 해야 한다. 상관 관계는 두 클래스들이 서로를 인식하고 있다면 일직선이 되어야 하고, 클래스의 한 편만 알고 있는 관계라면 화살표 표시가 되어있는 선을 그어야 한다.

그림 3: 그림 2의 클래스 객체가 포함된 클래스 다이어그램

그림 3에서, 상속 관계와 두 개의 상관 관계를 보았다. CDSalesReport 클래스는 Report 클래스에서 상속을 받고, CDSalesReport는 한 개의 CD와 관련이 되어 있지만, CD 클래스는 CDSalesReport에 대해 아무것도 모르고 있다. CD와 Band 클래스는 서로에 대해 알고 있고, 두 클래스는 서로 연관되어 있다.

클래스 다이어그램에는 이 보다 더 많은 개념들을 적용할 수 있다. 나중에 설명하도록 하겠다.

위로

시퀀스 다이어그램

시퀀스 다이어그램은 특정 사용 케이스에 대한 상세한 흐름이나 심지어는 특정 사용 케이스의 일부분 까지도 보여준다. 대부분이 설명을 포함하고 있다. 시퀀스에서 다른 객체들 간의 호출관계를 보여주고 있고, 다른 객체들로의 다른 호출까지 상세하게 보여줄 수 있다.

시퀀스 다이어그램은 2차원으로 그려진다. 수직 차원은 발생 시간 순서로 메시지/호출 시퀀스를 보여주고 있다. 수평 차원은 메시지가 전송되는 객체 인스턴스를 나타내고 있다.

시퀀스 다이어그램은 그리기가 매우 간단하다. 다이어그램의 상단에 각 클래스 인스턴스를 박스 안에 놓아 클래스 인스턴스(객체)를 구분한다. (그림 4) 박스 안에 클래스 인스턴스 이름과 클래스 이름을 스페이스/콜론/스페이스 " : "로 분리시킨다. (예, myReportGenerator : ReportGenerator) 클래스 인스턴스가 메시지를 또 다른 클래스 인스턴스로 보내면 클래스 인스턴스를 받는 곳을 가리키는 화살표를 긋는다. 그 라인 위에 메시지/메소드 이름을 적는다. 중요한 메시지의 경우는 원래의 클래스 인스턴스를 다시 향하도록 점선 화살표를 그릴 수 있다. 점선 위에 리턴 값을 라벨링한다. 개인적으로는 리턴 값을 포함하곤 하는데 상세한 부분을 읽기 쉽기 때문이다.

시퀀스 다이어그램을 읽기는 매우 간단하다. 시퀀스를 시작하는 "드라이버(driver)" 클래스 인스턴스가 있는 왼쪽 상단 코너부터 시작한다. 그런 다음, 다이어그램 아래쪽을 각 메시지를 따라간다. 그림 4의 시퀀스 다이어그램 예제에서 전송 메시지에 대한 리턴 메시지는 선택적인 것임을 기억하라.

그림 4: 시퀀스 다이어그램

그림 4의 시퀀스 다이어그램을 읽다 보면 CD Sales Report가 어떻게 만들어지는지를 알 수 있다. aServlet 객체가 우리의 드라이버 예제이다. aServlet은 메시지를 gen이라고 하는 ReportGenerator 클래스 인스턴스로 보낸다. 이 메시지는 generateCDSalesReport 라는 라벨링이 붙는다. ReportGenerator 객체가 이 메시지 핸들러를 구현한다는 의미이다. 자세히 보면, generateCDSalesReport 메시지 라벨은 괄호 안에 cdId가 있다. gen 인스턴스가 generateCDSalesReport 메시지를 받으면 CDSalesReport로 연속 호출을 하고 aCDReport 라고 하는 CDSalesReport의 실제 인스턴스가 리턴 된다. gen 인스턴스는 리턴된 aCDReport 인스턴스에 호출하면서 여기에 각 메시지 호출에 대한 매개변수를 전달한다. 시퀀스의 끝에서 gen 인스턴스는 콜러였던 aServlet에 aCDReport를 리턴한다.

그림 4의 시퀀스 다이어그램은 전형적인 시퀀스 다이어그램을 상세히 설명한 것이다. 하지만 충분히 이해할 수 있을 것이다. 또한 초보 개발자들에게는 각 레벨 마다 시퀀스를 끊어서 이해하는 것도 좋은 방법이다.

위로

스테이트 차트 다이어그램

스테이트 차트 다이어그램은 클래스가 개입된 다양한 상태(state)를 모델링 하고 그 클래스가 상태간 어떻게 이동하는지를 모델링 한다. 모든 클래스는 상태를 갖고 있지만 각각의 클래스가 스테이트 차트 다이어그램을 가질 수 없다. "중요한" 상태, 말하자면 시스템 작동 중 세 개 이상의 잠재적 상태가 있는 클래스일 경우만 모델링 되어야 한다.

그림 5에서 보듯, 스테이트챠트 다이어그램에는 다섯 개의 기본 엘리먼트들이 사용된다. 시작점(짙은 원), 스테이트 간 이동(화살표), 스테이트(모서리가 둥근 직사각형), 결정 포인트(속이 비어있는 원), 한 개 이상의 종료점(원 내부에 짙은 원이 그려져 있음)이 바로 그것이다. 스테이트챠트 다이어그램을 그리려면 시작점과 클래스의 초기 상태를 향하는 화살표로 시작한다. 다이어그램 어디에나 이 스테이트를 그릴 수 있고 스테이트 이동 라인을 사용하여 연결한다.

그림 5:시스템 작동 중 클래스가 실행되는 다양한 상태를 보여주는 스테이트 차트 다이어그램

그림 5의 스테이트 차트 다이어그램은 중요한 정보를 보여주고 있다. 예를 들어, 대출 프로세스가 Loan Application 상태에서 출발한다고 말할 수 있다. 결과에 따라 사전 승인 프로세스가 완료되면 Loan Pre-approved 상태나 Loan Rejected 상태로 옮겨간다. 이동하는 동안 내린 결정은 결정 포인트로 보여진다. 이동 라인 상의 비어있는 원이 바로 그것이다. 이 예제를 보면 Loan Closing 상태를 거치지 않고는 대출이 Loan Pre-Approved 상태에서 Loan in Maintenance 상태로 갈 수 없음을 알 수 있다. 또한, 모든 대출이 Loan Rejected 상태 또는 Loan in Maintenance 상태에서 끝난다는 것도 알 수 있다.

위로

액티비티 다이어그램

액티비티 다이어그램은 액티비티를 처리하는 동안 두 개 이상의 클래스 객체들 간 제어 흐름을 보여준다. 액티비티 다이어그램은 비즈니스 단위 레벨에서 상위 레벨의 비즈니스 프로세스를 모델링 하거나 저수준 내부 클래스 액션을 모델링 하는데 사용된다. 내가 경험한 바로는 액티비티 다이어그램은 기업이 현재 어떻게 비즈니스를 수행하는지, 또는 어떤 것이 비즈니스에 어떤 작용을 하는지 등의 고차원 프로세스를 모델링 할 때 가장 적합하다. 액티비티 다이어그램은 언뜻 보기에 시퀀스 다이어그램 보다는 덜 기술적이기 때문에 비즈니스 마인드를 가진 사람들이 빠르게 이해할 수 있다.

액티비티 다이어그램의 표기법은 스테이트 차트 다이어그램과 비슷하다. 스테이트 차트 다이어그램과 마찬가지로 액티비티 다이어그램은 초기 액티비티에 연결된 실선으로 된 원에서 시작한다. 이 액티비티는 모서리가 둥근 직사각형을 그려 그 안에 액티비티 이름을 적어 넣으면서 모델링 된다. 액티비티들은 이동 라인을 통해 다른 액티비티들에 연결되거나 결정 포인트의 조건에 제약을 받는 다른 액티비티들에 연결하는 결정 포인트로 연결될 수 있다. 모델링 된 프로세스를 종료하는 액티비티는 (스테이트 차트 다이어그램에서처럼) 종료 포인트에 연결된다. 이 액티비티들은 수영 레인으로 그룹핑 될 수 있다. 이것은 실제로 액티비티를 수행하는 객체를 나타내는데 사용된다. (그림 6)

그림 6: 두 개의 객체(밴드 매니저와 리포팅 툴)에 의한 액티비티 제어를 나타내는 두 개의 수영 레인으로 되어있다.

액티비티 다이어그램 예제는 두 개의 객체(밴드 매니저와 리포팅 툴)에 의한 액티비티 제어를 나타내는 두 개의 수영 레인으로 되어있다. 프로세스는 한 밴드에 대한 판매 리포트를 보는 밴드 매니저로 시작한다. 리포팅 툴은 사람이 관리하는 모든 밴드들을 검색하여 디스플레이하고 이중 한 개를 고를 것을 요청한다. 밴드 매니저가 한 밴드를 선택하면 리포팅 툴은 판매 정보를 검색하여 판매 리포트를 디스플레이 한다.

위로

컴포넌트 다이어그램

컴포넌트 다이어그램은 시스템을 물리적으로 볼 수 있도록 한다. 이것의 목적은 소프트웨어가 시스템의 다른 소프트웨어 컴포넌트들(예를 들어, 소프트웨어 라이브러리)에 대해 소프트웨어가 갖고 있는 종속 관계를 보여주는 것이다. 이 다이어그램은 매우 고급 레벨에서 볼 수 있거나 컴포넌트 패키지 레벨에서 볼 수 있다. ^{주 2}

컴포넌트 다이어그램의 모델링은 이 예제에 잘 설명되어 있다. 그림 7은 네 개의 컴포넌트인 Reporting Tool, Billboard Service, Servlet 2.2 API, JDBC API를 보여주고 있다. Reporting Tool에서 출발하여 Billboard Service, Servlet 2.2 API, JDBC API로 가는 화살표는 Reporting Tool이 이들 세 개의 컴포넌트에 종속되어 있음을 나타낸다.

그림 7: 컴포넌트 다이어그램은 다양한 소프트웨어 컴포넌트들 간 종속관계를 보여준다.

위로

전개 다이어그램

전개 다이어그램은 하드웨어 환경에 시스템이 물리적으로 어떻게 전개되는지를 보여준다. 목적은 시스템의 다양한 컴포넌트들이 어디에서 실행되고 서로 어떻게 통신하는지를 보여주는 것이다. 다이어그램이 물리적 런타임을 모델링 하기 때문에 시스템 사용자는 이 다이어그램을 신중하게 사용해야 한다.

전개 다이어그램의 표기법에는 컴포넌트 다이어그램에서 사용되던 표기법 요소들이 포함된다. 이외에 노드 개념을 포함하여 두 가지 정도 추가되었다. 노드는 물리적 머신 또는 가상 머신 노드(메인프레임 노드)를 표현한다. 노드를 모델링 하려면 3차원 큐브를 그려 큐브 상단에 노드 이름을 적는다. 시퀀스 다이어그램에서 사용되던 네이밍 규칙([instance name] : [instance type]) (예, "w3reporting.myco.com : Application Server").

그림 8: 전개 다이어그램. Reporting Tool 컴포넌트가 WebSphere 내부에서 그려지기 때문에(w3.reporting.myco.com 노드의 내부에서 그려짐), 사용자들은 로컬 머신에서 실행되는 브라우저를 통해 Reporting Tool에 액세스 하면서 기업 인트라넷을 통해 HTTP에 연결할 수 있다는 것을 알 수 있다.

그림 8의 전개 다이어그램은 사용자가 로컬 머신에서 실행되고 기업의 인트라넷에서 HTTP를 통해 Reporting Tool에 연결되는 브라우저를 사용하여 Reporting Tool에 접근하는 것을 보여주고 있다. 이 툴은 물리적으로 w3reporting.myco.com 이라고 하는 Application Server에서 실행된다. 이 다이어그램은 IBM WebSphere 내부에서 그려진 Reporting Tool 컴포넌트를 보여준다. 이것은 결과적으로 node w3.reporting.myco.com에서 그려지게 되어있다. Reporting Tool은 자바를 사용하여 리포팅 데이터베이스를 IBM DB2의 JDBC 인터페이스에 연결하여 원시 DB2 통신을 사용하는 db1.myco.com 서버상에서 실행되는 실제 DB2 데이터베이스와 통신한다. 리포팅 데이터베이스와 통신하는 것 외에도 Report Tool 컴포넌트는 SOAP over HTTPS를 통해 Billboard Service와 통신한다.

위로

결론

이 글은 Unified Modeling Language에 대한 간단한 입문서에 불과하지만 여러분이 이 정보를 실제 프로젝트에 적용하거나 더 깊게 UML을 연구하기를 바란다. UML 다이어그램을 소프트웨어 개발 프로세스에 통합시키는 여러 소프트웨어 툴이 있지만, 자동화된 툴이 없더라도 화이트보드에 마커와 펜을 사용하여 UML 다이어그램을 그려도 좋다.

위로

주

¹ 상속과 기타 객체 지향 원리에 대한 기타 자세한 정보는 http://java.sun.com/docs/books/tutorial/java/concepts/inheritance.html을 참조하기 바란다.

² 컴포넌트 패키지 레벨은 프로그래밍 언어에 중립적인 방식으로 .NET의 네임스페이스(System.Web.UI)나 자바의 패키지(java.util) 같은 클래스 컨테이너 레벨을 참조하는 것이다.

이 글에는 트랙백을 보낼 수 없습니다

Application_developing/C++ 2007/07/20 13:38

세마포어란 무엇인가

세마포어(Semaphores)를 비록 IPC설비중의 하나로 분류하긴 했지만, 다른 파이프, 메시지큐, FIFO 등과는 좀다르다. 다른 IPC 설비들이 대부분 프로세스간 메시지 전송을 그 목적으로 하는데 반해서 세마포어는 프로세스간 데이타를 동기화 하고 보호하는데 그목적이 있다.

프로세스간 메시지 전송을 하거나, 혹은 공유메모리를 통해서 특정 데이타를 공유하게 될경우 발생하는 문제가, 공유된 자원에 여러개의 프로세스가 동시에 접근을 하면 안되며, 단지 한번에 하나의 프로세스만 접근 가능하도록 만들어줘야 할것이다. 이것은 쓰레드에서 메시지간 동기화를 위해서 mutex 를 사용하는것과 같은 이유이다.

하나의 데이타에 여러개의 프로세스가 관여할때 어떤 문제점이 발생할수 있는지 간단한 예를 들어보도록 하겠다.

int count=100; 
A 프로세스가 count 를 읽어들인다.     100
B 프로세스가 count 를 읽어들인다.     100
B 프로세스가 count 를 1 증가 시킨다.  101 
A 프로세스가 count 를 1 증가 시킨다.  101

count 는 공유자원(공유메모리 같은)이며 A와 B 프로그램이 여기에 대한 작업을 한다. A가 1을 증가 시키고 B가 1을 증가시키므로 최종 count 값은 102 가 되어야 할것이다. 그러나 A 가 작업을 마치기 전에 B가 작업을 하게 됨으로 엉뚱한 결과를 보여주게 되었다. 위의 문제를 해결하기 위해서는 count 에 A가 접근할때 B프로세스가 접근하지못하도록 block 시키고, A가 모든 작업을 마쳤을때 B프로세스가 작업을 할수 있도록 block 를 해제 시키면 될것이다.
우리는 세마포어를 이용해서 이러한 작업을 할수 있다. 한마디로 줄여서 세마포어는 "여러개의 프로세스에 의해서 공유된는 자원의 접근제어를 위한 도구" 이다.

세마포어의 작동원리

작동원리는 매우 간단하다. 차단을 원하는 자원에대해서 세마포어를 생성하면 해당자원을 가리키는 세마포어 값이 할당된다. 이 세마포어 값에는 현재 세마포어를 적용하고 있는 자원에 접근할수 있는 프로세스의 숫자를 나타낸다. 이 값이 0이면 이 자원에 접근할수 있는 프로세스의 숫자가 0이라는 뜻이며, 자원), 0보다 큰 정수면 해당 정수의 크기만큼의 프로세스가 자원에 접근할수 있다라는 뜻이 된다. 그러므로 우리는 접근제어를 해야하는 자원에 접근하기 전에 세마포어 값을 검사해서 값이 0이면 자원을 사용할수 있을때까지 기다리고, 0보다 더크면(1이라고 가정하자) 자원에 접근해서 세마포어 값을 1 감소 시켜서, 세마포어 값을 0으로 만들어서, 다른 프로세스가 자원에 접근할수 없도록 하고, 자원의 사용이 끝나면 세마포어 값을 다시 1증가시켜서 다른 프로세스가 자원을 사용할수 있도록 만들어주면 된다.

만약 세마포어 값을 검사했는데 세마포어 값이 0이라면 사용할수 있게 될때까지 (1이 될때까지) 기다리면 (block) 될것이다.

세마포어의 사용

세마포어의 사용은 위의 작동원리를 그대로 적용한다. 즉 1. 세마포어로 제어할 자원을 설정한다.
2. 해당 자원을 사용하기전에 세마포어 값을 확인한다.
3. 세마포어 값이 0보다 크면 자원을 사용하고, 세마포어 값을 1 감소 시킨다.
4. 세마포어 값이 0이면 값이 0보다 커질때까지 block 되며, 0보다 커지게 되면 2번 부터 시작하게 된다.

위의 작업을 위해서 Unix 는 다음과 같은 관련함수들을 제공한다.

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

int semget(key_t key, int nsems, int semflg);
int semop (int semid, struct sembuf *sops, unsigned nsops);
int semctl(int semid, int semnum, int cmd, union semun arg);

세마포어의 관리

세마포어는 그 특성상 원자화된 연산을 필요로 한다. 이러한 원자화된 연산은 유저레벨의 함수에서는 제공하기가 힘들므로, 세마포어 정보는 커널에서 전용 구조체를 이용해서 관리하게 된다. 다음은 커널에서 세모포어 정보를 유지하기 위해서 관리하는 구조체인 semid_ds 구조체의 모습이다. semid_ds 는 /usr/include/bits/sem.h 에 선언되어 있다. (이것은 리눅스에서의 경우로 Unix 버젼에 따라서 위치와 멤버변수에 약간씩 차이가 있을수 있다)

struct semid_ds
{
    struct ipc_perm sem_perm;     
    __time_t sem_otime;           
    unsigned long int __unused1;
    __time_t sem_ctime;           
    unsigned long int __unused2;
    unsigned long int sem_nsems;  
    unsigned long int __unused3;
    unsigned long int __unused4;
};

sem_perm 은 세마포어에 대한 퍼미션으로 일반 파일퍼미션과 마찬가지의 기능을 제공한다. 즉 현재 세마포어 구조체에 접근할수 있는 사용자권한을 설정한다. sem_nsems 는 생성할수 있는 세마포어의 크기이다. sem_otime 은 마지막으로 세마포어관련 작업을 한 시간(semop 함수를 이용)이며, sem_ctim 은 마지막으로 구조체 정보가 바뀐 시간이다.

semget 을 이용해서 세마포어를 만들자.

세마포어의 생성혹은 기존에 만들어져 있는 세마포어에 접근하기 위해서 유닉스에서 는 semget(2)를 제공한다. 첫번째 아규먼트는 세마포어의 유일한 키값을 위해서 사용하는 int 형의 키값이다. 우리는 이 key 값을 이용해서 유일한 세마포어를 생성하거나 접근할수 있게 된다. 새로 생성되거나 기존의 세마포어에 접근하거나 하는것은 semflg 를 통해서 제어할수 있다. 다음은 semflg 의 아규먼트이다.

IPC_CREAT: 만약 커널에 해당 key 값으로 존재하는 세마포어가 없다면, 새로 생성 한다.
IPC_EXCL: IPC_CREAT와 함께 사용하며, 해당 key 값으로 세마포어가 이미 존재한다면 실패값을 리턴한다.

semflg 를 통해서 세마포어에 대한 퍼미션을 지정할수도 있다. 퍼미션 지정은 보통의 파일에 대해서 유저/그룹/other 에 대해서 지정하는것과 같다.

만약 IPC_CREAT 만 사용할경우 해당 key 값으로 존재하는 세마포어가 없다면, 새로 생성하고, 이미 존재한다면 존재하는 세마포어의 id 를 넘겨준다. IPC_EXCL을 사용하면 key 값으로 존재하는 세마포어가 없을경우 새로 생성되고, 이미 존재한다면 존재하는 id 값을 돌려주지 않고 실패값(-1)을 되돌려주고, errno 를 설정한다.

nsems 은 세마포어가 만들어질수 있는 크기이다. 이값은 최초 세마포어를 생성하는 생성자의 경우에 크기가 필요하다(보통 1). 그외에 세마포어에 접근해서 사용하는 소비자의 경우에는 세마포어를 만들지 않고 단지 접근만 할뿐임으로 크기는 0이 된다.

이상의 내용을 정리하면 semget 은 아래와 같이 사용할수 있을것이다.

만약 최초 생성이라면
    sem_num = 1;
그렇지 않고 만들어진 세마포어에 접근하는 것이라면
    sem_num = 0; 
sem_id = semget(12345, sem_num, IPC_CREAT|0660)) == -1)
{
    perror("semget error : ");
    return -1;
}

semget 은 성공할경우 int 형의 세마포어 지사자를 되돌려주며, 모든 세마포어에 대한 접근은 이 세마포어 지시자를 사용하게 된다.

위의 코드는 key 12345 를 이용해서 세마포어를 생성하며 퍼미션은 0660으로 설정된다. 세마포어의 크기는 1로 잡혀 있다(대부분의 경우 1). 만약 기존에 key 12345 로 이미 만들어진 세마포어가 있다면 새로 생성하지 않고 기존의 세마포어에 접근할수 있는 세마포어 지시자를 되돌려주게 되고, 커널은 semget 를 통해 넘어온 정보를 이용해서 semid_ds 구조체를 세팅한다.

예제: semget.c

#include <sys/types.h> 
#include <sys/ipc.h> 
#include <sys/sem.h> 

int main()
{
    int semid;
    semid = semget((key_t)12345, 1, 0666 | IPC_CREAT);
}

이제 위의 코드를 컴파일해서 실행시키고 나서 실제로 세마포어 정보가 어떻게 바뀌였는지 확인해 보도록 하자.

커널에서 관리되는 ipc 정보를 알아보기 위해서는 ipcs(8)라는 도구를 이용하면 된다.

[root@localhost test]# ipcs -s    
------ Semaphore Arrays --------
key        semid      owner      perms      nsems      status      
0x00003039 0          root      666        1

0x00003039 은 key 12345 의 16진수 표현이다. 퍼미션은 666으로 되어 있고 semget 를 통해서 제대로 설정되어 있음을 알수 있다.

세마포어를 이용해서 접근제어 하기

이제 semget 을 통해서 세마포어를 만들거나 접근했으니, 이제 실제로 세마포어상태를 검사해서 접근제어를 해보도록하자.

이것은 semop를 통해서 이루어진다.

semop 의 첫번째 semid 는 semget 을 통해서 얻은 세마포어 지시자이다. 2번째 아규먼트는 struct sembuf 로써, 어떤 연산을 이루어지게 할런지 결정하기 위해서 사용된다. 구조체의 내은 다음과 같으며, sys/sem.h 에 선언되어 있다.

struct sembuf
{
    short sem_num;    // 세마포어의수
    short sem_op;     // 세마포어 연산지정
    short sem_flg;    // 연산옵션(flag)
}

sem_num 멤버는 세마포어의 수로 여러개의 세마포어를 사용하지 않는다면(즉 배열이 아닐경우) 0을 사용한다. 배열의 인덱스 사이즈라고 생각하면 될것이다. 보통의 경우 하나의 세마포어를 지정해서 사용하므로 0 이 될것이다.
sem_op 를 이용해서 실질적으로 세마포어 연산을 하게 되며, 이것을 이용해서 세마포어 값을 증가시키거나 감소 시킬수 잇다. sem_op 값이 양수일 경우는 자원을 다 썼으니, 세마포어 값을 증가시키겠다는 뜻이며, 음수일 경우에는 세마포어를 사용할것을 요청한다라는 뜻이다. 음수일 경우 세마포어값이 충분하다면 세마포어를 사용할수 있으며, 커널은 세마포어의 값을 음수의 크기의 절대값만큼을 세마포어에서 빼준다. 만약 세마포어의 값이 충분하지 않다면 세번째 아규먼트인 sem_flg 에 따라서 행동이 결정되는데,
sem_flg 가 IPC_NOWAIT로 명시되어 있다면, 해당영역에서 기다리지 않고(none block) 바로 에러코드를 리턴한다. 그렇지 않다면 세마포어를 획득할수 있을때까지 block 되게 된다. sem_flg 는 IPC_NOWAIT 와 SEM_UNDO 2개의 설정할수 있는 값을가지고 있다. IPC_NOWAIT 는 none block 모드 지정을 위해서 사용되며, SEM_UNDO 는 프로세스가 세마포어를 돌려주지 않고 종료해버릴경우 커널에서 알아서 세마포어 값을 조정(증가) 할수 있도록 만들어 준다.

설명이 아마 애매모호한면이 있을것이다. 간단한 상황을 예로 들어서 설명해 보겠다.

현재 세마포어 값이 1 이라고 가정하자. 
이때 A 프로세스가 semop 를 통해서 세마포어에 접근을 시도한다. 
A는 접근을 위해서 sem_op 에 -1 을 세팅한다. 즉 세마포어 자원을 1 만큼 사용하겠다라는 
뜻이다.   
현재 준비된 세마포어 값은 1로 즉시 사용할수 있으므로, 
A는 자원을 사용하게 되며, 커널은 세마포어 값을 1 만큼 감소시킨다. 

이때 B 라는 프로세스가 세마포어 자원을 1 만큼 사용하겠다라고 요청을 한다. 
그러나 지금 세마포어 값은 0 이므로 B는 지금당장 세마포어 를 사용할수 없으며, 
기다리거나, 에러값을 리턴 받아야 한다(IPC_NOWAIT).   
B는 자원 사용가능할때까지 기다리기로 결정을 했다.  

잠수후 A는 모든 작업을 다마쳤다. 
이제 세마포어를 되돌려줘야 한다. sem_op 에 1 을 세팅하면, 
커널은 세마포어 값을 1증가시키게 된다. 

드디어 기다리던 B가 세마포어 자원을 사용할수 있는 때가 도래했다. 
이제 세마포어 값은 1이 므로 B는 세마포어를 획득하게 된다.  
커널은 세마포어 값을 1 감소 시킨다.
B는 원하는 작업을 한다.
...
...

세마포어 조작

semctl 이란 함수를 이용해서 우리는 세마포어를 조정할수 있다. semctl 은 semid_ds 구조체를 변경함으로써 세마포어의 특성을 조정한다.

첫번째 아규먼트인 semid 는 세마포어 지시자이다. semnum 은 세마포어 배열을 다룰 경우 사용되며, 보통은 0이다. cmd 는 세마포어 조작명령어 셋으로 다음과 같은 조작명령어들을 가지고 있다. 아래는 그중 중요하다고 생각되는 것들만을 설명하였다. 더 자세한 내용은 semctl 에 대한 man 페이지를 참고하기 바란다.

IPC_STAT: 세마포어 상태값을 얻어오기 위해 사용되며, 상태값은 arg 에 저장된다.
IPC_RMID: 세마포어 를 삭제하기 위해서 사용한다.
IPC_SET: semid_ds 의 ipc_perm 정보를 변경함으로써 세마포어에 대한 권한을 변경한다.

예제를 통한 이해

지금까지 익혔던 내용을 토대로 간단한 예제프로그램을 만들어보겠다. 예제의 상황은 하나의 파일에 2개의 프로세스가 동시에 접근하고자 하는데에서 발생한다. 파일에는 count 숫자가 들어 있으며, 프로세스는 파일을 열어서 count 숫자를 읽어들이고, 여기에 1을 더해서 다시 저장하는 작업을한다. 이것을 세마포어를 통해서 제어하지 않으면 위에서 설명한문제가 발생할것이다.

위의 문제를 해결하기 위해서는 파일을 열기전에 세마포어를 설정해서 한번에 하나의 프로세스만 접근가능하도록 하면 될것이다. 모든 파일작업을 마치게 되면, 세마포어 자원을 돌려줌으로써, 비로서 다른 프로세스가 접근가능하게 만들어야 한다.
예제: sem_test

#include <sys/types.h> 
#include <sys/sem.h> 
#include <sys/ipc.h> 
#include <stdio.h> 
#include <unistd.h> 

#define SEMKEY 2345 

union semun
{
    int val;
    struct semid_ds *buf;
    unsigned short int *array;
};

static int  semid;
int main(int argc, char **argv)
{
    FILE* fp;
    char buf[11];
    char count[11];

    union semun sem_union; 

    // open 과 close 를 위한 sembuf 구조체를 정의한다. 
    struct sembuf mysem_open  = {0, -1, SEM_UNDO}; // 세마포어 얻기
    struct sembuf mysem_close = {0, 1, SEM_UNDO};  // 세마포어 돌려주기
    int sem_num;

    memset(buf, 0x00, 11);
    memset(count, 0x00, 11);

    // 아규먼트가 있으면 생성자
    // 그렇지 않으면 소비자이다.
    if (argc > 1)
        sem_num = 1;
    else 
        sem_num = 0;            

    // 세마포설정을 한다. 
    semid = semget((key_t)234, sem_num, 0660|IPC_CREAT);
    if (semid == -1)
    {
        perror("semget error ");
        exit(0);
    }    

    // counter.txt 파일을 열기 위해서 세마포어검사를한다. 
    if(semop(semid, &mysem_open, 1) == -1)
    {
        perror("semop error ");
        exit(0);
    }

    if ((fp = fopen("counter.txt", "r+")) == NULL)
    {
        perror("fopen error ");
        exit(0);
    }
    // 파일의 내용을 읽은후 파일을 처음으로 되돌린다.  
    fgets(buf, 11, fp);
    rewind(fp);

    // 개행문자를 제거한다. 
    buf[strlen(buf) - 1] = 0x00;

    sprintf(count, "%d\n", atoi(buf) + 1); 
    printf("%s", count);
    // 10초를 잠들고 난후 count 를 파일에 쓴다. 
    sleep(10);
    fputs(count,fp);

    fclose(fp);
    // 모든 작업을 마쳤다면 세마포어 자원을 되될려준다
    semop(semid, &mysem_close, 1);
    return 1;
}

코드는 매우 간단하지만, 세마포어에 대한 기본적인 이해를 충분히 할수 있을만한 코드이다. 생성자와 소비자의 분리는 프로그램에 넘겨지는 아규먼트를 이용했다. 모든 작업을 마치면 테스트를 위해서 10초를 기다린후에 세마포어를 돌려주도록 코딩되어 있다.

우선 count 를 저장할 파일 counter.txt 를 만들고 여기에는 1을 저장해 놓는다. 그다음 ./sem_test 를 실행시키는데, 최초에는 생성자를 만들어야 하므로 아규먼트를 주어서 실행시키고, 그다음에 실행시킬때는 소비자가 되므로 아규먼트 없이 실행하도록 하자. 다음은 테스트 방법이다.

[root@coco test]# ./sem_test 1&
[1] 3473
36
[root@coco test]# ./sem_test

위 코드를 실행해보면 ./sem_test 1 이 세마포어자원을 돌려주기 전까지 ./sem_test 가 해당영역에서(세마포어 요청하는 부분) 블럭되어 있음을 알수 있고, 충돌없이 count가 잘되는것을 볼수 있을것이다.

세마포어는 커널에서 관리하는데 세마포어를 사용하는 프로세스가 없다고 하더라도 semctl 을 이용해서 제거하지 않는한은 커널에 남아있게 된다. 세마포어 정보를 제거하기 위해서는 semctl 연산을 하든지, 컴퓨터를 리붓 시커거나, ipcrm(8)이란 도구를 사용해서 제거시켜 줘야 한다.

결론

세마포어는 fcntl 과 매우 비슷한 일을 수행하는데, 좀더 세밀한 조정이 가능하다라는 장점을 가지고 있지만 간단한 일을 하기엔 지나치게 복잡한 면이 없잖아 있다. 이럴경우에는 세마포어 대신 fcntl 을 사용하자.

이 글에는 트랙백을 보낼 수 없습니다

by웅쓰

웅자의 상상플러스

전체 (379)

게임 (5)

영화 (2)

기타 (23)

맛집 (5)

영어 (2)

대수학 (3)

형태소 (5)

Hacking (9)

Linux (112)

HTML (48)

Application_developing (48)

Web_developing (102)

Window (11)

상상플러스 - 최근 공지

웅자 홈피 계속 업데이트 됩니다 !

pelly.ryu 2016
라고 적고 보니 4번이 눈에 띄었네요. 지우려고 ...
pelly.ryu 2016
저처럼 누군가 또 찾아올 것 같아 남깁니다.. `$c...
와우 2014
깔끔하게 풀어두셔서 많은 도움이 되었습니다 ^^ ...
ghewffsef 2014
잘배우고 갑니다.. 말한마디 한마디 가슴에 와...
지나가던길 2014
고맙습니다. 많은 참고가 되었습니다.

2016/01 (1)
2015/12 (3)
2015/10 (3)
2015/03 (2)
2015/01 (4)

내 미니홈피.

5개로 나눌 때 O 성능 분석

3개로 나눌 때 O 성능 분석

What is a Java Virtual Machine?

The Lifetime of a Java Virtual Machine

The Architecture of the Java Virtual Machine

Data Types

Word Size

The Class Loader Subsystem

Loading, Linking and Initialization

The Bootstrap Class Loader

User-Defined Class Loaders

Name Spaces

The Method Area

Type Information

The Constant Pool

Field Information

Method Information

Class Variables

A Reference to Class ClassLoader

A Reference to Class Class

Method Tables

An Example of Method Area Use

The Heap

Garbage Collection

Object Representation

Array Representation

The Program Counter

The Java Stack

The Stack Frame

Local Variables

Operand Stack

Frame Data

Possible Implementations of the Java Stack

Native Method Stacks

Execution Engine

The Instruction Set

Execution Techniques

Threads

Native Method Interface

The Real Machine

Eternal Math: A Simulation

On the CD-ROM

The Resources Page

세마포어의 작동원리

세마포어의 사용

세마포어의 관리

semget 을 이용해서 세마포어를 만들자.

세마포어를 이용해서 접근제어 하기

세마포어 조작

예제를 통한 이해

결론

상상플러스 - 최근 공지

A Reference to Class `ClassLoader`

A Reference to Class `Class`