[fwd] CSS absolute positioning, relative positioning, and the document flow

http://www.cnblogs.com/tim-li/archive/2012/07/09/2582618.html

 

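Preface

I have been working with HTML and CSS for quite a while, but every time I lay out a page with div + CSS I still feel a little unsure, and sometimes I just fall back to a table. Divs often produce problems I did not expect. I can solve them given enough time, but that is no way to work, so today I set out to explore how absolute positioning, relative positioning, and the document flow relate to each other in a div + CSS layout.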

 

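The concept of document flow

Strictly speaking, this should be called the mechanism of the document flow model. HTML layout is based on this model: block-level elements (block) each occupy a whole line, while inline elements (inline) do not.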

 

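For example, block-level elements (block):

div1
div2

The result:

OK, but how do we know this is not simply because no width or height was set? As Deng Xiaoping taught us, practice is the sole criterion for testing truth:

div1
div2

The result: see, I was not fooling you...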

 

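Another example, inline elements (inline):

<img src="Image/close.gif" alt="image" style=" height:100px; width:100px;"/>
<a href="http:">Link 1</a>
<a href="http:">Link 2</a>

The result: an inline element that follows another inline element does not start a new line.

Now let's put a block element after the inline elements:

<img src="Image/close.gif" alt="image" style=" height:100px; width:100px;"/>
<a href="http:">Link 1</a>
<a href="http:">Link 2</a>
div1

The result: div1 (block) starts a new line and occupies it alone.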

 

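Relative positioning: position:relative

As the name suggests, relative positioning positions an element relative to something. What is that something? Nothing exotic: it is the element itself. After you set left, right, top, or bottom, the element is shifted from its original position. Because position:relative does not take the element out of the document flow, the space the element originally occupied is preserved.

Here is an experiment; one look and you will get it.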

div1
div2
div3

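The result: this is the default document-flow layout, without position:relative.

Now give div2 position:relative and move it with top:-20px; left:50px:

div1
div2
div3

The result: to show the effect while saving myself the trouble of Photoshop, I took the screenshot straight from the IDE, but I promise the browser lays it out the same way. The blue border marks div2's original position; the black border marks where position:relative puts it, 50px to the right of and 20px above the original position (left:50px pushes the box away from its original left edge, and top:-20px pulls it up). Notice also that div3 has not moved up to follow div2: position:relative does not take the element out of the document flow, so the space it originally occupied is preserved.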
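The offset arithmetic can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the `Point` and `RelativeLayout` types and the `layout_relative` function are made up for this note (they are not a browser API), coordinates are in px from the top-left corner of the page, and div2's starting position of (0, 100) is an assumed value.

```c
#include <assert.h>

/* Hypothetical types for illustration; not part of any browser API. */
typedef struct {
    int x; /* px from the left edge of the page */
    int y; /* px from the top edge of the page */
} Point;

typedef struct {
    Point reserved; /* slot the element still occupies in the document flow */
    Point painted;  /* where the box is actually drawn */
} RelativeLayout;

/* Applies position:relative offsets to a box whose normal-flow position
 * is flow_pos. A positive `left` moves the box right, a negative `top`
 * moves it up; the reserved slot never moves. */
RelativeLayout layout_relative(Point flow_pos, int top, int left)
{
    RelativeLayout result;
    result.reserved = flow_pos;
    result.painted.x = flow_pos.x + left;
    result.painted.y = flow_pos.y + top;
    return result;
}

/* The experiment above: div2 with top:-20px; left:50px.
 * The starting position (0, 100) is an assumed value. */
void relative_example(void)
{
    Point div2_flow = {0, 100};
    RelativeLayout div2 = layout_relative(div2_flow, -20, 50);
    assert(div2.painted.x == 50 && div2.painted.y == 80);
    assert(div2.reserved.x == 0 && div2.reserved.y == 100);
}
```

Because `reserved` is returned unchanged, whatever follows in the flow (div3 in the experiment) is laid out as if div2 had never moved.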

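Absolute positioning: position:absolute

OK, now we get to the meaty part. Set the complicated document-flow business aside for a moment. So-called absolute positioning still needs something to be positioned against, and that something is the element's nearest ancestor that has a position value other than static. If that sounds like a mouthful, put it this way: the kid (the element) can be positioned against his dad, his grandpa, his great-grandpa... (the ancestor elements), namely the first one who is rich (position:absolute) or an official (position:relative); it just cannot be a programmer (position:static, or no position set at all). If no ancestor qualifies, the element is positioned against the page itself (the initial containing block). Also note that position:absolute takes the element out of the document flow, so after repositioning the element no longer occupies its original place.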
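The ancestor lookup described here is a walk up the parent chain. The following C sketch models it under simplifying assumptions: `Element`, `Position`, and `positioning_ancestor` are hypothetical names invented for this note, and the model ignores other things that can establish a containing block in real browsers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an element tree; not a browser API. */
typedef enum { POS_STATIC, POS_RELATIVE, POS_ABSOLUTE, POS_FIXED } Position;

typedef struct Element {
    const char *name;
    Position position;
    const struct Element *parent; /* NULL for the root element */
} Element;

/* Returns the ancestor that an absolutely positioned element is placed
 * against: the nearest ancestor whose position is not static.
 * Returns NULL when no ancestor qualifies, which stands for the
 * initial containing block (the page itself). */
const Element *positioning_ancestor(const Element *el)
{
    assert(el != NULL);
    for (const Element *a = el->parent; a != NULL; a = a->parent) {
        if (a->position != POS_STATIC) {
            return a;
        }
    }
    return NULL;
}
```

With a static dad and a relative grandpa, the walk skips dad and stops at grandpa; with nobody positioned, it runs off the top of the tree and returns NULL.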

Let's follow the code again.

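Red: great-grandpa
Green: grandpa
Yellow: dad
div1
div2
div3
</div> </div> </div>

The result: first, div1, div2, and div3 get three ancestor divs: a yellow dad, a green grandpa, and a red great-grandpa. For now, none of them has a position property set.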

 

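Now give the dad div position:relative (an official!) and give div2 position:absolute; left:120px; top:100px:

Red: great-grandpa
Green: grandpa
Yellow: dad
div1
div2
div3
</div> </div> </div>

The result: the blue lines show that div2 is positioned against the yellow div (the IDE's blue line blends with the yellow and changes color), 120px from dad's left edge and 100px from his top edge. Because position:absolute takes the element out of the document flow, div2's original space is not preserved and div3 moves up to fill it.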

 

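Let's try again with grandpa:

Red: great-grandpa
Green: grandpa
Yellow: dad
div1
div2
div3
</div> </div> </div>

The result: div2 still has position:absolute; left:120px; top:100px, but the blue lines show that this time it is positioned against the green grandpa, and div2 has again left the document flow.

As for great-grandpa, he is getting on in years, so we will let him off.

That is all for now. This is just my humble understanding; if you spot a mistake, please point it out. When I have time I will write up margin layout and float layout.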

Copyright © Tim

Thanks also to these three articles:

http://apps.hi.baidu.com/share/detail/2284634

http://hi.baidu.com/lileding/item/ae30c31e43c09bfe86ad4e30

http://wenku.baidu.com/view/477959140b4e767f5acfce32.html

css float and clear

https://css-tricks.com/all-about-floats/

All About Floats

What is “Float”?

Float is a CSS positioning property. To understand its purpose and origin, we can look to print design. In a print layout, images may be set into the page such that text wraps around them as needed. This is commonly and appropriately called “text wrap”. Here is an example of that.

In page layout programs, the boxes that hold the text can be told to honor the text wrap, or to ignore it. Ignoring the text wrap will allow the words to flow right over the image like it wasn’t even there. This is the difference between that image being part of the flow of the page (or not). Web design is very similar.

In web design, page elements with the CSS float property applied to them are just like the images in the print layout where the text flows around them. Floated elements remain a part of the flow of the web page. This is distinctly different than page elements that use absolute positioning. Absolutely positioned page elements are removed from the flow of the webpage, like when the text box in the print layout was told to ignore the page wrap. Absolutely positioned page elements will not affect the position of other elements and other elements will not affect them, whether they touch each other or not.

Setting the float on an element with CSS happens like this:

#sidebar {
  float: right;			
}

There are four valid values for the float property. Left and Right float elements in those directions respectively. None (the default) ensures the element will not float, and Inherit assumes the float value of the element’s parent element.

What are floats used for?

Aside from the simple example of wrapping text around images, floats can be used to create entire web layouts.

Floats are also helpful for layout in smaller instances. Take for example this little area of a web page. If we use float for our little avatar image, when that image changes size the text in the box will reflow to accommodate:

This same layout could be accomplished using relative positioning on the container and absolute positioning on the avatar as well. Done this way, the text would be unaffected by the avatar and would not be able to reflow on a size change.

Clearing the Float

Float’s sister property is clear. An element that has the clear property set on it will not move up adjacent to the float like the float desires, but will move itself down past the float. Again an illustration probably does more good than words do.

In the above example, the sidebar is floated to the right and is shorter than the main content area. The footer then is required to jump up into that available space as is required by the float. To fix this problem, the footer can be cleared to ensure it stays beneath both floated columns.

#footer {
  clear: both;			
}

Clear has four valid values as well. Both is most commonly used, which clears floats coming from either direction. Left and Right can be used to only clear the float from one direction respectively. None is the default, which is typically unnecessary unless removing a clear value from a cascade. Inherit would be the fifth, but is strangely not supported in Internet Explorer. Clearing only the left or right float, while less commonly seen in the wild, definitely has its uses.

The Great Collapse

One of the more bewildering things about working with floats is how they can affect the element that contains them (their “parent” element). If this parent element contained nothing but floated elements, the height of it would literally collapse to nothing. This isn’t always obvious if the parent doesn’t contain any visually noticeable background, but it is important to be aware of.

As anti-intuitive as collapsing seems to be, the alternative is worse. Consider this scenario:

If the block element on top were to have automatically expanded to accommodate the floated element, we would have an unnatural spacing break in the flow of text between paragraphs, with no practical way of fixing it. If this were the case, we designers would be complaining much harder about this behavior than we do about collapsing.

Collapsing almost always needs to be dealt with to prevent strange layout and cross-browser problems. We fix it by clearing the float after the floated elements in the container but before the close of the container.

Techniques for Clearing Floats

If you are in a situation where you always know what the succeeding element is going to be, you can apply the clear: both; value to that element and go about your business. This is ideal as it requires no fancy hacks and no additional elements making it perfectly semantic. Of course things don’t typically work out that way and we need to have more float-clearing tools in our toolbox.

  • The Empty Div Method is, quite literally, an empty div: <div style="clear: both;"></div>. Sometimes you’ll see a <br> element or some other random element used, but div is the most common because it has no browser default styling, doesn’t have any special function, and is unlikely to be generically styled with CSS. This method is scorned by semantic purists since its presence has no contextual meaning at all to the page and is there purely for presentation. Of course in the strictest sense they are right, but it gets the job done right and doesn’t hurt anybody.

  • The Overflow Method relies on setting the overflow CSS property on a parent element. If this property is set to auto or hidden on the parent element, the parent will expand to contain the floats, effectively clearing them for succeeding elements. This method can be beautifully semantic as it may not require any additional elements. However, if you find yourself adding a new div just to apply this, it is just as non-semantic as the empty div method and less adaptable. Also bear in mind that the overflow property isn’t specifically for clearing floats. Be careful not to hide content or trigger unwanted scrollbars.
  • The Easy Clearing Method uses a clever CSS pseudo selector (:after) to clear floats. Rather than setting the overflow on the parent, you apply an additional class like “clearfix” to it. Then apply this CSS:
    .clearfix:after { 
       content: "."; 
       visibility: hidden; 
       display: block; 
       height: 0; 
       clear: both;
    }

    This will apply a small bit of content, hidden from view, after the parent element’s content, which clears the float. This isn’t quite the whole story, as additional code needs to be used to accommodate older browsers.

Different scenarios call for different float clearing methods. Take for example a grid of blocks, each of different types.

To better visually connect the similar blocks, we want to start a new row as we please, in this case when the color changes. We could use either the overflow or easy clearing method if each of the color groups had a parent element. Or, we use the empty div method in between each group. Three wrapping divs that didn’t previously exist or three after divs that didn’t previously exist. I’ll let you decide which is better.

Problems with Floats

Floats often get beat on for being fragile. The majority of this fragility comes from IE 6 and the slew of float-related bugs it has. As more and more designers are dropping support for IE 6, you may not care, but for the folks that do care here is a quick rundown.

  • Pushdown is a symptom of an element inside a floated item being wider than the float itself (typically an image). Most browsers will render the image outside the float, but not have the part sticking out affect other layout. IE will expand the float to contain the image, often drastically affecting layout. A common example is an image sticking out of the main content and pushing the sidebar down below.

    Quick fix: Make sure you don’t have any images that do this, use overflow: hidden to cut off excess.

  • Double Margin Bug – Another thing to remember when dealing with IE 6 is that if you apply a margin in the same direction as the float, it will double the margin. Quick fix: set display: inline on the float, and don’t worry, it will remain a block-level element.
  • The 3px Jog is when text that is up next to a floated element is mysteriously kicked away by 3px, like a weird forcefield around the float. Quick fix: set a width or height on the affected text.
  • In IE 7, the Bottom Margin Bug occurs when a floated parent has floated children inside it: the bottom margin on those children is ignored by the parent. Quick fix: use bottom padding on the parent instead.

Alternatives

If you need text wrapping around images, there really aren’t any alternatives for float. Speaking of which, check out this rather clever technique for wrapping text around irregular shapes. But for page layout, there definitely are choices. Eric Sol right here on A List Apart has an article on Faux Absolute Positioning, which is a very interesting technique that in many ways combines the flexibility of floats with the strength of absolute positioning. CSS3 has the Template Layout Module that, when widely adopted, will prove to be the page layout technique of choice.

Video

I did a screencast a while back explaining many of these float concepts.

 

 

 

https://perishablepress.com/lessons-learned-concerning-the-clearfix-css-hack/

http://complexspiral.com/publications/containing-floats/

http://www.positioniseverything.net/easyclearing.html

div vs table and marking up properly and semantically

http://softwareengineering.stackexchange.com/questions/277778/why-are-people-making-tables-with-divs

http://softwareengineering.stackexchange.com/questions/164988/why-would-one-bother-marking-up-properly-and-semantically

http://softwareengineering.stackexchange.com/questions/165618/why-is-semantic-markup-given-more-weight-for-search-engines?lq=1

https://www.smashingmagazine.com/2009/04/from-table-hell-to-div-hell/

std::bind and lambda functions

http://www.randomprogramming.com/2014/05/stdbind-and-lambda-functions-1/

http://www.randomprogramming.com/2014/06/stdbind-and-lambda-functions-2/

http://www.randomprogramming.com/2014/06/stdbind-and-lambda-functions-3/

http://www.randomprogramming.com/2014/06/stdbind-and-lambda-functions-4/

http://www.randomprogramming.com/2014/06/stdbind-and-lambda-functions-5/

http://www.randomprogramming.com/2014/07/stdbind-and-lambda-functions-6/

OpenGL Multi-Context ‘Fact’ Sheet


Recently I have been interested in working with multiple windows in OpenGL. In most cases this also means working with multiple OpenGL contexts and, on occasion, threads. This “Fact Sheet” is meant to be a summary of my research on this topic to help me keep it all straight (it’s a complex topic).

General

  • You can change which OS window a GL context is bound to; note, however, that this first involves unbinding it from the window it was originally attached to. This assumes that both windows use the same pixel format. This isn’t much of an issue these days, as modern OSs have pretty much standardized on 24-bit RGB for their pixel format, so you aren’t likely to come across a system that mixes them anymore.[1]
  • There are hardware/vendor specific OpenGL extensions which allow you to specify which card to use for a given GL context and, in some cases, allow movement between cards. These calls are often limited to specific high end cards (e.g. the NVIDIA Quadro cards).[2][3]

Threading

  • Each application thread can only have one OpenGL context bound at any given time. You can change which GL Context is active for a thread at any time using an OS specific call.[4]
  • In most cases dividing OpenGL rendering across multiple threads will not result in any performance improvement (due to the pipeline nature of OpenGL). You can however achieve significant performance improvements by using a second thread for data streaming (see the Performance section below).

Sharing

It is possible to share information between different OpenGL contexts, subject to some restrictions.[5]

  • Sharing can only occur within the same OpenGL implementation, i.e. you cannot share between a software renderer, an ATI hardware renderer (ATI card) and an NVIDIA hardware renderer (NVIDIA card), etc. Each of these uses different code to implement OpenGL (i.e. different OpenGL.dll files, or whatever the equivalent is on your platform of choice); even if they are the same version of OpenGL, they are still separate and can’t share.
  • You can share data between different OpenGL contexts, even if these contexts are bound to different GPUs (once again assuming they use the same OpenGL implementation/drivers). Some things to note:
    • This is done using OS-specific extensions. On Windows you use the wglShareLists() function; Mac OS X and Linux X11 have their own methods as well. The wglShareLists() function shares data between the context(s) you specify.
    • If you have multiple separate cards then the data gets copied once to each card, as they each have different address spaces and it is the only way to “share” the data. This can significantly slow down sending data to the GPUs.
  • There is no “primary” context. By that I mean that no individual context controls/owns the data being shared; ownership is shared along with the data. For example, ordinarily when you close an OpenGL context it will clean up after itself. However, when sharing between contexts the shared data will only be cleaned up after all contexts that use it are destroyed. So you can do something like: create context one, load a texture with it, create context two setting it to share with context one, delete context one, use the texture in context two.
  • In general all ‘Data’ objects are shared between OpenGL contexts, including but not limited to:
    • Vertex Buffer Objects (VBOs), i.e. vertex data
    • Index Buffer Objects (IBOs), i.e. indices
    • Shader Programs* (see below)
    • Textures
    • Pixel Buffer Objects (PBOs)
    • Render Buffers
    • Sampler Objects

* One thing to note with shader programs is that their state is context independent. This means that any uniforms you bind for that shader remain bound even after switching contexts, e.g.

Bind Context 1
Bind Shader 1                    // now available for use in context 1
Bind Model Matrix 1              // bound to shader 1
Bind Texture 1                   // bound to shader 1
Bind Context 2
Bind Shader 1                    // now available for use in context 2,
                                 // note that it still has Model
                                 // Matrix 1 and Texture 1 bound to it!!
  • Depending on how you configure your render system this may cause issues. Also note that the currently bound Texture is just another uniform attached to the shader.
  • In contrast ‘State’ objects are not shared between contexts, including but not limited to:
    • Vertex Array Objects (VAOs)
    • Framebuffer Objects (FBOs)
  • In my reading I have seen it recommended that you create all your contexts and setup sharing before you start sending data to the GPU(s). This is what I’ve done to date in my testing, so I have no idea what happens if you don’t do this.

Performance

Assuming that you are rendering the same scene, from the same viewpoint, with the same options, to the same sized buffer (i.e. same resolution) on the same GPU, and you have set up your render pipeline to completely render the scene for each context in a serial fashion to reduce context switching, then the performance penalty for multi-context rendering can be calculated using the following rule of thumb:

Multi Context Performance = Single context Performance / Number of Contexts – context switching overhead
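As a worked example, the rule of thumb can be written as a small C function. This is a sketch under the assumptions listed above; the function name and parameters are made up for illustration, all three quantities are assumed to be in the same unit (e.g. frames per second), and clamping the result at zero is my own addition so the estimate never goes negative.

```c
#include <assert.h>

/* Rule-of-thumb estimate from the text:
 *   multi = single / contexts - switching_overhead
 * All values share one unit (e.g. frames per second). */
double estimate_multi_context_fps(double single_context_fps,
                                  int context_count,
                                  double switching_overhead_fps)
{
    assert(context_count > 0);
    double estimate =
        single_context_fps / context_count - switching_overhead_fps;
    /* Clamp at zero: an assumption of this sketch, not part of the rule. */
    return estimate > 0.0 ? estimate : 0.0;
}
```

For instance, a scene that runs at 2300 fps in one context, rendered into two contexts with an assumed overhead of 50 fps, is estimated at 2300 / 2 - 50 = 1100 fps. The overhead term has to be measured; the formula does not supply it.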

If any of the above assumptions are not true then f**k only knows what the performance impact is going to be. You’ll need to do some benchmarking to find out. The point is that each extra render context has a major impact on the application’s performance.

It is worth noting that most OpenGL implementations and graphics drivers are single threaded (Mac OSX seems to be an exception as it seems that there is an optional multithreaded implementation you can use). This means that while it is possible and, on most modern implementations, safe to be rendering using multiple contexts at the same time, there is very little performance improvement. In fact there is most likely a performance penalty to doing this. Why? Because the OpenGL driver queues up the commands in a pipeline as it receives them, when sending it render commands from multiple contexts in an asynchronous fashion you end up with a queue like this:

glUseProgram()    // context 1
glActiveTexture() // context 1
glDrawElements()  // context 2
glBindTexture()   // context 1
glDrawElements()  // context 2
glUseProgram()    // context 2
glDrawElements()  // context 1

As you can see, each context/thread is happily doing its own thing. However, as the driver executes each command it needs to change to the appropriate context; in the case above that means 4 context switches for just 7 commands. Remember that graphics cards do not like switching contexts. This is why it is in most cases better to just render to each context sequentially once per frame, as this minimizes the context switching and results in more predictable/measurable performance.
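The count of 4 can be checked mechanically: a switch happens every time a command belongs to a different context than the command before it. A minimal C sketch, where `count_context_switches` is a hypothetical helper and the queue is reduced to a list of context ids:

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often consecutive commands in a queue belong to different
 * contexts. context_ids[i] is the context that issued command i.
 * Binding the very first context is not counted as a switch. */
size_t count_context_switches(const int *context_ids, size_t count)
{
    assert(context_ids != NULL || count == 0);
    size_t switches = 0;
    for (size_t i = 1; i < count; ++i) {
        if (context_ids[i] != context_ids[i - 1]) {
            ++switches;
        }
    }
    return switches;
}
```

The queue above corresponds to the ids {1, 1, 2, 1, 2, 2, 1}, which gives 4 switches. Rendering each context sequentially reorders the same seven commands to {1, 1, 1, 1, 2, 2, 2}, which needs only one.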

I’ve tried rendering using multiple contexts on multiple threads (based on my GLFW3 tutorial). Without synchronising the threads, the demo went from running at ~2300 frames-per-second on a single thread to ~60 frames-per-second when multi-threaded. It also caused the NVIDIA drivers to crash in some instances due to race conditions, so not only is it a lot slower, it is also unstable.

When synchronising the threads so that only one of them can render at any given time I experienced a more modest performance reduction of 18%. By synchronising the threads most (if not all) of the extra cost associated with the OpenGL context switching is removed, leaving only the additional overhead of syncing the threads.

Given the above performance, it is not recommended that you render using multiple OpenGL contexts. A better use of a second OpenGL context is data streaming, i.e. moving data to/from the GPU. In one case I have seen the use of a PBO + second context/thread achieve up to a 300% performance improvement in rendering 1080p video when compared to a single context/thread without using a PBO to manage the texture data.

Demo Project

I have uploaded a small demo project to GitHub; it is based on my GLFW3 tutorial (as discussed above). If you wish to play with the demo, I suggest starting on line 75 of ThreadingDemo.cpp (through to about line 90). By toggling the g_bDoWork variable you can control whether or not “work” is simulated. Below this you’ll find three loops: the first is a single-threaded loop (both windows render on the same thread), the second is the bad/unstable loop (I don’t recommend running this), and the third is the good multi-threaded loop (which is ‘on’ by default). I’ve simply commented out all but one loop; change which one is active to see the different results.

BTW I would not recommend using this as a “best practice” example as to how to render from multiple threads.

Further Reading

(Or some other interesting information on this topic)

Info on Context Destruction in an OO setting (has implications for multi Context environments too)
An article on using multi GPU and OpenGL Contexts to render to multiple monitors
A FAQ on parallel OpenGL Programming
OpenGL and Multithreading
A tutorial on using a second OpenGL context for texture streaming.
The story of multi-monitor rendering (Mainly about DirectX but there’s some interesting stuff on OpenGL here too; also I think the Win7 bug discussed is still in Win8).
OpenGL Insights (Great book on OpenGL, Chapter 28 is free online and very relevant to multi-context OpenGL).

References

1. http://www.w3chools.com/browsers/browsers_display.asp
2. http://www.opengl.org/registry/specs/NV/gpu_affinity.txt
3. http://www.opengl.org/registry/specs/AMD/wgl_gpu_association.txt
4. http://www.opengl.org/wiki/OpenGL_and_multithreading
5. http://www.opengl.org/registry/doc/glspec44.core.pdf (Chapter 5, p. 47)

Let’s talk about eglMakeCurrent, eglSwapBuffers, glFlush, glFinish

https://katatunix.wordpress.com/2014/09/17/lets-talk-about-eglmakecurrent-eglswapbuffers-glflush-glfinish/

I have been interested in OpenGL ES 2.0 for years and today I want to share some of my experience.

eglMakeCurrent

First of all, do you remember the function eglMakeCurrent? Its prototype is:

EGLBoolean eglMakeCurrent(
    EGLDisplay display,
    EGLSurface draw,
    EGLSurface read,
    EGLContext context);

This prototype requires you to understand what Display, Surface, and Context are.

Display

As you have seen, most EGL functions require a Display as the first parameter. A Display is just a connection to the native windowing system running on your computer, like the X Window System on GNU/Linux systems, or Microsoft Windows.

Therefore, before you can use EGL functions, you must create and initialize a Display connection. This is easily done in a two-call sequence, as shown below:

EGLint majorVersion;
EGLint minorVersion;
EGLDisplay display; // EGLDisplay is just a void* type
display = eglGetDisplay(EGL_DEFAULT_DISPLAY);
if (display == EGL_NO_DISPLAY) {
    // Unable to open connection to local windowing system
}
if (!eglInitialize(display, &majorVersion, &minorVersion)) {
    // Unable to initialize EGL. Handle and recover
}

You usually (always) pass EGL_DEFAULT_DISPLAY to the function eglGetDisplay so EGL will select the default Display automatically.

Context

Context is nothing but a container that contains two things:

  • Internal state machine (view port, depth range, clear color, textures, VBO, FBO …).
  • A command buffer to hold GL commands that have been called in this context.

In general, a Context’s purpose is to store the input data of rendering.

Surface

You can guess that a Surface’s purpose is to store the output of rendering. Indeed, a Surface extends a native window or pixmap with additional auxiliary buffers. These buffers include a color buffer, a depth buffer, and a stencil buffer.

Back to eglMakeCurrent

So what does the function eglMakeCurrent do? If you are wondering: when I call a GL command (e.g. glDrawElements), which context and which surface is affected by that GL command; then the answer is eglMakeCurrent:

EGLBoolean eglMakeCurrent(
    EGLDisplay display,
    EGLSurface draw,
    EGLSurface read,
    EGLContext context);

eglMakeCurrent binds context to the current rendering thread AND TO the draw and read surfaces. draw is used for all GL operations except for any pixel data read back (glReadPixels, glCopyTexImage2D, and glCopyTexSubImage2D), which is taken from the frame buffer values of read.

Therefore, when you call the GL command on a thread T, OpenGL ES will detect which context C was bound to T, and which surface S[draw] and S[read] were bound to C.

Multiple contexts

Sometimes you want to create and use more than one context. Here are the rules for that case:

  • Binding the same context in 2 threads is NEVER allowed.
  • Binding 2 different contexts in 2 different threads to the same surface is NEVER allowed.
  • Binding 2 different contexts in 2 different threads to 2 different surfaces MAY be allowed, but MAY fail (depending on the GPU implementation you are using).

Shared contexts

Shared contexts are useful in the loading phase of video games. Uploading data (especially textures) to the GPU is expensive, so if you want your game’s frame rate to stay stable, you should run the upload on another thread. Because of the three rules above, you must create a secondary context that shares the internal state machine of the primary context (in practice this means data objects such as textures, buffer objects, and shader programs; per-context state such as the viewport is not shared). The primary and secondary contexts are then called shared contexts.

Please note that these two contexts share the internal state machine only; they DO NOT share their command buffers.

To create the secondary context, you call:

EGLContext eglCreateContext(
    EGLDisplay display,
    EGLConfig config,
    EGLContext share_context,
    EGLint const * attrib_list);

The 3rd parameter share_context is important here, it is the primary context.

In the secondary thread, you should not draw anything, just upload data to GPU. Hence the surface that you use for this context should be a pixel buffer surface:

EGLSurface eglCreatePbufferSurface(
    EGLDisplay display,
    EGLConfig config,
    EGLint const * attrib_list);

eglSwapBuffers

EGLBoolean eglSwapBuffers(
    EGLDisplay display,
    EGLSurface surface);

When I first learned this function, I thought its purpose was to swap the display and the surface :)) very silly.

Actually, the only thing you need to focus on here is the surface. If the surface is a pixel buffer surface, there is nothing to do and the function returns without any error.

If the surface is a double-buffer surface (you often use this), the function will swap the front-buffer and the back-buffer inside the surface. The back-buffer is used to store output of rendering, while front-buffer is used by the native window to show color on your monitors.

glFlush and glFinish

The OpenGL ES driver and the GPU run in a parallel, asynchronous fashion. When you call a GL command, the driver tries to send it to the GPU as soon as possible for best performance. However, the GPU does not execute the command immediately; the command is just added to a queue inside the GPU. Hence, if you ask the driver to send too many GL commands in a short time, the GPU queue might fill up, and the driver has to keep those commands in the command buffer of the current context. (Do you remember the command buffer of a context?)

When these pending commands are sent to the GPU is an open question. In general, most OpenGL ES driver implementations send them when the next GL command is called. If you need the sending to happen explicitly, call glFlush; this blocks the current thread until all commands have been sent to the GPU. glFinish is more powerful: it blocks the current thread until all commands have been sent to the GPU and completely executed by it. Be careful, because your application’s performance will suffer.

glFlush and glFinish are explicit synchronization. Sometimes implicit synchronization happens too, namely when eglSwapBuffers is called. Because this command is executed directly by the driver, there is a chance that the GPU would draw pending glDraw* commands onto an unexpected surface buffer, since the front-buffer and back-buffer have already been swapped. So, as you may have guessed, before swapping the driver must block the current thread and wait until all pending glDraw* commands that affect the current surface have finished.

Of course, with double-buffer surfaces you never need to call glFlush or glFinish, because eglSwapBuffers performs an implicit synchronization. But with single-buffer surfaces (i.e. in the secondary thread above), you MUST call glFlush at the right time. For example, a call to glFlush before the thread exits is a MUST; otherwise your GL commands might NEVER be sent to the GPU.

Okay, now I can call glFinish 🙂

[fwd]Inside ELF Symbol Tables

https://blogs.oracle.com/ali/entry/inside_elf_symbol_tables

Inside ELF Symbol Tables

ELF files are full of things we need to keep track of for later access: names, addresses, sizes, and intended purpose. Without this information, an ELF file would not be very useful. We would have no way to make sense of the impenetrable mass of octal or hexadecimal numbers.

Consider: When you write a program in any language above direct machine code, you give symbolic names to functions and data. The compiler turns these things into code. At the machine level, they are known only by their address (offset within the file) and their size. There are no names in this machine code. How then, can a linker combine multiple object files, or a symbolic debugger know what name to use for a given address? How do we make sense of these files?

Symbols are the way we manage this information. Compilers generate symbol information along with code. Linkers manipulate symbols, reading them in, matching them up, and writing them out. Almost everything a linker does is driven by symbols. Finally, debuggers use them to figure out what they are looking at and to provide you with a human readable view of that information.

It is therefore a rare ELF file that doesn’t have a symbol table. However, most programmers have only an abstract knowledge that symbol tables exist, and that they loosely correspond to their functions and data, and some “other stuff”. Protected by the abstractions of compiler, linker, and debugger, we don’t usually need to know too much about the details of how a symbol table is organized. I’ve recently completed a project that required me to learn about symbol tables in great detail. Today, I’m going to write about the symbol tables used by the linker.

.symtab and .dynsym

Sharable objects and dynamic executables usually have 2 distinct symbol tables, one named “.symtab”, and the other “.dynsym”. (To make this easier to read, I am going to refer to these without the quotes or leading dot from here on.)

The dynsym is a smaller version of the symtab that only contains global symbols. The information found in the dynsym is therefore also found in the symtab, while the reverse is not necessarily true. You are almost certainly wondering why we complicate the world with two symbol tables. Won’t one table do? Yes, it would, but at the cost of using more memory than necessary in the running process.

To understand how this works, we need to understand the difference between allocable and non-allocable ELF sections. ELF files contain some sections (e.g. code and data) needed at runtime by the process that uses them. These sections are marked as being allocable. There are many other sections that are needed by linkers, debuggers, and other such tools, but which are not needed by the running program. These are said to be non-allocable. When a linker builds an ELF file, it gathers all of the allocable sections together in one part of the file, and all of the non-allocable sections are placed elsewhere. When the operating system loads the resulting file, only the allocable part is mapped into memory. The non-allocable part remains in the file, but is not visible in memory. strip(1) can be used to remove certain non-allocable sections from a file. This reduces file size by throwing away information. The program is still runnable, but debuggers may be hampered in their ability to tell you what the program is doing.
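As a rough illustration of the allocable/non-allocable distinction, here is a small Python sketch that walks the section headers of an ELF image and reports whether each section carries the SHF_ALLOC flag. It assumes a 64-bit little-endian file, and the function name section_alloc_flags is made up for this example.

```python
import struct

SHF_ALLOC = 0x2  # section occupies memory while the process runs


def section_alloc_flags(data: bytes):
    """Return [(section_name, is_allocable)] for an ELF64 little-endian image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian files")

    # Elf64_Ehdr: e_shoff lives at byte 40; e_shentsize, e_shnum and
    # e_shstrndx are the three 16-bit fields at bytes 58..63.
    (e_shoff,) = struct.unpack_from("<Q", data, 40)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 58)

    def shdr(i):
        # Elf64_Shdr fields: name, type, flags, addr, offset, size,
        # link, info, addralign, entsize
        return struct.unpack_from("<IIQQQQIIQQ", data, e_shoff + i * e_shentsize)

    names_offset = shdr(e_shstrndx)[4]  # file offset of the section name strings
    result = []
    for i in range(e_shnum):
        sh_name, _sh_type, sh_flags = shdr(i)[:3]
        start = names_offset + sh_name
        end = data.index(b"\0", start)
        result.append((data[start:end].decode(), bool(sh_flags & SHF_ALLOC)))
    return result
```

Run against the bytes of a real dynamic object (for example `section_alloc_flags(open(path, "rb").read())`, with `path` of your choosing), one would expect sections such as the code and the dynsym to come back as allocable and the symtab as non-allocable, in line with the description above.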

The full symbol table contains a large amount of data needed to link or debug our files, but not needed at runtime. In fact, in the days before sharable libraries and dynamic linking, none of it was needed at runtime. There was a single, non-allocable symbol table (reasonably named “symtab”). When dynamic linking was added to the system, the original designers faced a choice: Make the symtab allocable, or provide a second smaller allocable copy. The symbols needed at runtime are a small subset of the total, so a second symbol table saves virtual memory in the running process. This is an important consideration. Hence, a second symbol table was invented for dynamic linking, and consequently named “dynsym”.

And so, we have two symbol tables. The symtab contains everything, but it is non-allocable, can be stripped, and has no runtime cost. The dynsym is allocable, and contains the symbols needed to support runtime operation. This division has served us well over the years.

Types Of Symbols

Given how long symbols have been around, there are surprisingly few types:

STT_NOTYPE
Used when we don’t know what a symbol is, or to indicate the absence of a symbol.
STT_OBJECT / STT_COMMON
These are both used to represent data. (The word OBJECT in this context should not be interpreted as having anything to do with object orientation. STT_DATA might have been a better name.) STT_OBJECT is used for normal variable definitions, while STT_COMMON is used for tentative definitions. See my earlier blog entry about tentative symbols for more information on the differences between them.

STT_FUNC
A function, or other executable code.
STT_SECTION
When I first started learning about ELF, and someone would say something about “section symbols”, I thought they meant a symbol from some given section. That’s not it though: A section symbol is a symbol that is used to refer to the section itself. They are used mainly when performing relocations, which are often specified in the form of “modify the value at offset XXX relative to the start of section YYY”.
STT_FILE
The name of a file, either of an input file used to construct the ELF file, or of the ELF file itself.
STT_TLS
A third type of data symbol, used for thread local data. A thread local variable is a variable that is unique to each thread. For instance, if I declare the variable “foo” to be thread local, then every thread has a separate foo variable of their own, and they do not see or share values from the other threads. Thread local variables are created for each thread when the thread is created. As such, their number (one per thread) and addresses (depends on when the thread is created, and how many threads there are) are unknown until runtime. An ELF file cannot contain an address for them. Instead, a STT_TLS symbol is used. The value of a STT_TLS symbol is an offset, which is used to calculate a TLS offset relative to the thread pointer. You can read more about TLS in the Linker And Libraries Guide.
STT_REGISTER
The SPARC architecture has a concept known as a “register symbol”. These symbols are used to validate symbol/register usage, and can also be used to initialize global registers. Other architectures don’t use these.

In addition to symbol type, each symbol has other attributes:

  • Name (Optional: Not all symbols need a name, though most do)
  • Value
  • Size
  • Binding and Visibility
  • ELF Section it references

The exact meaning for some of these attributes depends on the type of symbol involved. For more details, consult the Solaris Linker and Libraries Guide, which is available in PDF form online.
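To make the attribute list concrete, here is a hedged Python sketch that decodes one 24-byte Elf64_Sym entry into those attributes. It assumes a 64-bit little-endian file and covers only the generic type and binding values; the SPARC register type and any platform-specific visibility values are left out. The names Symbol and decode_sym are invented for the example.

```python
import struct
from typing import NamedTuple

# Generic symbol types and bindings from the ELF specification.
STT_NAMES = {0: "STT_NOTYPE", 1: "STT_OBJECT", 2: "STT_FUNC", 3: "STT_SECTION",
             4: "STT_FILE", 5: "STT_COMMON", 6: "STT_TLS"}
STB_NAMES = {0: "STB_LOCAL", 1: "STB_GLOBAL", 2: "STB_WEAK"}


class Symbol(NamedTuple):
    name_offset: int   # offset of the name in the associated string table
    type: str
    binding: str
    visibility: int    # low two bits of st_other (generic ELF encoding)
    shndx: int         # index of the section the symbol refers to
    value: int
    size: int


def decode_sym(entry: bytes) -> Symbol:
    """Decode a single Elf64_Sym entry (little-endian)."""
    st_name, st_info, st_other, st_shndx, st_value, st_size = struct.unpack(
        "<IBBHQQ", entry)
    sym_type = st_info & 0xF   # low nibble of st_info holds the type
    binding = st_info >> 4     # high nibble of st_info holds the binding
    return Symbol(
        name_offset=st_name,
        type=STT_NAMES.get(sym_type, f"type {sym_type}"),
        binding=STB_NAMES.get(binding, f"binding {binding}"),
        visibility=st_other & 0x3,
        shndx=st_shndx,
        value=st_value,
        size=st_size,
    )
```

Note that type and binding share one byte (st_info), which is why the name itself is stored only as an offset into a separate string table rather than inline.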

Symbol Table Layout And Conventions

The symbols in a symbol table are written in the following order:

  1. Index 0 in any symbol table is used to represent undefined symbols. As such, the first entry in a symbol table (index 0) is always completely zeroed (type STT_NOTYPE), and is not used.
  2. If the file contains any local symbols, the second entry (index 1) in the symbol table will be a STT_FILE symbol giving the name of the file.
  3. Section symbols.
  4. Register symbols.
  5. Global symbols that have been reduced to local scope via a mapfile.
  6. For each input file that supplies local symbols, a STT_FILE symbol giving the name of the input file is put in the symbol table, followed by the symbols in question.
  7. The global symbols immediately follow the local symbols in the symbol table. Local and global symbols are always kept separate in this manner, and cannot be mixed together.

What would happen if we ignored these rules and reordered things in some other way (e.g. sorted by address)? There is no way to answer this question with 100% certainty. It would probably confuse existing tools that manipulate ELF files. In particular, it seems clear that the local and global symbols must remain separate. For years and years, arbitrary software has been free to assume the above layout. We can’t possibly know how much software has been written, or how dependent on layout it is. The only safe move is to maintain the well known layout described above.
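The two layout rules that tools lean on most, the zeroed entry at index 0 and the strict separation of local and global symbols, can be checked mechanically. The sketch below does so for a raw table of 24-byte Elf64_Sym entries. It relies on the ELF convention that the symbol table section's sh_info field holds the index of the first non-local symbol; the function name split_is_valid and the parameter names are invented for the example.

```python
SYM_SIZE = 24  # size of one Elf64_Sym entry


def split_is_valid(symtab: bytes, first_global: int) -> bool:
    """Check that entry 0 is zeroed and that locals precede non-locals.

    symtab       -- raw contents of a symbol table section (Elf64_Sym entries)
    first_global -- the section header's sh_info value, i.e. the index of
                    the first non-local symbol
    """
    count = len(symtab) // SYM_SIZE
    if count == 0 or symtab[:SYM_SIZE] != bytes(SYM_SIZE):
        return False  # index 0 must be the all-zero entry

    for i in range(count):
        st_info = symtab[i * SYM_SIZE + 4]  # st_info follows the 4-byte st_name
        is_local = (st_info >> 4) == 0      # binding 0 is STB_LOCAL
        if is_local != (i < first_global):
            return False                    # a symbol is on the wrong side
    return True
```

A table that had been re-sorted by address would most likely fail this check, which is one concrete way existing tools could be confused by a different ordering.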

Next Time: Augmenting The Dynsym

One of the big advantages of Solaris relative to other operating systems is the extensive support for observability: The ability to easily look inside a running program and see what it is doing, in detail. To do that well requires symbols. The symbols in the dynsym may not be enough to do a really good job. For example, to produce a stack trace, we need to take each function address and match it up to its name. If we are looking at a stripped file, or referencing the file from within the process using it via dladdr(3C), we won’t have any way to find names for the non-global functions, and will have to resort to displaying hex addresses. This is better than nothing, but not by much. The standard files in a Solaris distribution are not stripped for exactly this reason. However, many files found in production are stripped, and in-process inspection is still limited to the dynsym.

Machines are much larger than they used to be. The memory saved by the symtab/dynsym division is still a good thing, but there are times when we wish that the dynsym contained a bit more data. This is harder than it sounds. The layout of dynsym interacts with the rest of an ELF file in ways that are set in stone by years of existing practice. Backward compatibility is a critical feature of Solaris. We try extremely hard to keep those old programs running. And yet, the needs of observability, spearheaded by important new features like DTrace, put pressure on us in the other direction.
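The address-to-name matching needed for a stack trace can be sketched in a few lines of Python. This is only an illustration of the idea, not how dladdr(3C) or any Solaris tool is implemented: it assumes the function symbols have already been extracted into a list of (start address, size, name) tuples sorted by address, and the function name symbolize is made up.

```python
import bisect


def symbolize(symbols, addr):
    """Map an address to 'name+offset', or to a bare hex address.

    symbols -- list of (start_address, size, name), sorted by start_address
    addr    -- the address to look up, e.g. a return address from a stack
    """
    starts = [start for start, _size, _name in symbols]
    i = bisect.bisect_right(starts, addr) - 1  # last symbol starting at or before addr
    if i >= 0:
        start, size, name = symbols[i]
        if addr < start + size:
            return f"{name}+{addr - start:#x}"
    # No covering symbol: fall back to the raw address, which is what
    # happens for non-global functions when only the dynsym is available.
    return f"{addr:#x}"
```

With a table containing only global functions, any address inside a local (static) function falls through to the hex fallback, which is exactly the limitation described above.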

This discussion is a prelude to work I recently did to augment the dynsym to contain local symbols, while preserving full backward compatibility with older versions of Solaris. I plan to cover that in a future blog entry. ELF is old, and much of how it works cannot be changed. Its original designers (our “Founding Fathers”, as Rod calls them) anticipated that this would be the case, based no doubt on hard experience with earlier systems. The ELF design is therefore uniquely flexible, which explains why it has survived as long as it has. There is always a way to add something new. Sometimes, it takes several tries to find the best way.

Merging Font with FontForge


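A while ago, with help from cauchy, I finally solved a font-merging problem that had bothered me for a long time. I am writing the details down here for future reference.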

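Problem: embed one font A (usually an English font) into another font B (usually a Chinese font), replacing the corresponding glyphs in B. Why do this? Simple: the built-in ASCII glyphs of most Chinese fonts are ugly or dull, and there are far more English fonts than Chinese ones. You often come across a beautiful English font and wish it could be displayed alongside your favorite Song, round, or Hei typeface. TeX offers elaborate machinery for mapping fonts to different scripts, but the simplest and most general approach is still to build a single font yourself that contains both the English font A and the Chinese font B.
P.S. I mainly use the merged fonts in WoW and in TeX typesetting.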

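Requirements: besides the basics, such as keeping the glyphs undistorted and not losing any characters, the solution should also satisfy the following:
1. The procedure should be easy to carry out across platforms, because I use a Mac most of the time and Windows and Linux some of the time;
2. The resulting font should be as cross-platform compatible as possible: it should work on Mac, Linux, and Windows, in ordinary applications as well as in TeX typesetting engines.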

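This problem is not easy to solve, and you only find out how hard it is by trying. An expensive tool like FontLab Studio turns out to have a 5,000-character limit (sigh... it presumably only caters to Latin scripts). The somewhat cheaper FontCreator only lets you select, copy, and paste glyphs by hand. That is tedious, and the TrueType fonts it generates also lack certain descriptors and cannot be displayed properly by Font Book on the Mac. Hence the solution below. Let’s go!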

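Solution: FontForge
This open, powerful tool can do just about anything, but be careful not to be put off by its ugly, slow GUI front end. For a job like merging fonts, its command line plus Python scripting works much better. Yes, FontForge supports Python scripts, and it provides dozens of functions that let you use its font-processing features from a script.
Then follow the steps below: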

  1. Install FontForge. Depending on your OS this may need some platform-specific handling, which I won't go into here:
    $ fontforge -version
    Copyright (c) 2000-2008 by George Williams.
     Executable based on sources from 16:34 GMT 30-Mar-2008.
     Library based on sources from 15:57 GMT 30-Mar-2008.
    fontforge 20080330
    libfontforge 20080330
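  2. Create a temporary directory and put the English font A and the Chinese font B that you want to merge into it;
  3. Prepare the following script (it is written in FontForge's native scripting language), name it font-merge.pe, and save it in that temporary directory: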
    
    # Pre-operation for some non-standard Chinese font file
    Open("font_b.ttf")
    SelectAll()
    ScaleToEm(1024)
    Generate("temp.ttf", "", 0x14)
    Close()
     
    # Open English font and merge to the Chinese font
    Open("font_a.ttf")
    SelectAll()
    ScaleToEm(1024)
     
    MergeFonts("temp.ttf")
    SetFontNames("FontName", "Font Family", "Full Name", "Style", "")
    Generate("font_merged.ttf", "", 0x14)
    Close()

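    A few things to note:

    • Line 2 (the first Open call): the parentheses hold the file name of the Chinese font; it must match the file you put in the temporary directory;
    • Line 9 (the second Open call): the parentheses hold the file name of the English font; it must match the file you put in the temporary directory;
    • Line 14 (the SetFontNames call): this sets the attributes of the output font. The first argument is the font's PostScript name, which must be in English with no spaces; it is also the only identifier that is consistent across platforms. The second argument is the family name and the third is the full name, both of which may contain spaces. The fourth argument is the style description, which can be "Regular", "Bold", "Italic", and so on;
    • Line 15 (the final Generate call): the parentheses hold the file name of the output font; it must not clash with any file already in the temporary directory.

    For details, see FontForge's scripting function reference; it is all easy to follow.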

  4. Run the following command in that temporary directory:
    $ fontforge -script font-merge.pe
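  5. Wait for the command to finish, then pick up the merged font from the temporary directory. You may see some warnings along the way, but they can be ignored. Some Chinese fonts cause errors, for example the well-known Microsoft YaHei: that font does a lot of internal hacking, so many standard tools cannot process it. There is no fix for this kind of problem for now. There are several other good Chinese fonts, so if one fails, just switch to another. Also, the examples here all use TrueType fonts, but OpenType is supported as well, and less common formats can be used here too as long as FontForge supports them.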
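Since FontForge also supports Python scripting, the same steps can be sketched with its Python module. This is an untested rough equivalent, not a drop-in replacement for the script above: it assumes the fontforge Python module is installed, it does not pass the 0x14 flags that the script gives to Generate, and it leaves out the style argument. The helper names is_valid_postscript_name and merge_fonts are invented for the example, and the name check covers only the two constraints mentioned above (ASCII, no spaces).

```python
import re


def is_valid_postscript_name(name: str) -> bool:
    """True if the name is non-empty, printable ASCII, and has no spaces."""
    return bool(re.fullmatch(r"[\x21-\x7e]+", name))


def merge_fonts(english_path, chinese_path, out_path,
                ps_name, family, full_name, em=1024):
    """Merge the Chinese font's glyphs into the English font and save it."""
    if not is_valid_postscript_name(ps_name):
        raise ValueError("PostScript name must be ASCII with no spaces")

    # Imported here so the name check above works without FontForge installed.
    import fontforge

    # Pre-process the Chinese font: rescale to the common em size.
    base = fontforge.open(chinese_path)
    base.em = em                 # setting em rescales the whole font
    base.generate("temp.ttf")
    base.close()

    # Open the English font, rescale it, and pull in the missing glyphs.
    font = fontforge.open(english_path)
    font.em = em
    font.mergeFonts("temp.ttf")
    font.fontname = ps_name      # PostScript name
    font.familyname = family
    font.fullname = full_name
    font.generate(out_path)
    font.close()
```

A call might look like `merge_fonts("font_a.ttf", "font_b.ttf", "font_merged.ttf", "FontName", "Font Family", "Full Name")`, run from the temporary directory with an interpreter that can import fontforge.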

That’s it. I hope this helps anyone with a similar need.